0. Introduction

This document contains the exploratory data analysis, model building and analysis for Coasting on Couches, the term project for MGT 6203 (Spring semester). Whilst the project document and the presentation slides give a more human-readable account, this document combines some exposition with the code that generates the graphs.

The code and data are available on a GitHub repo.

0.1 Potential Gotchas

  1. Please download the data from the GitHub repo.
  2. You may have trouble installing the following packages via the chunk below. If you do, please manually install them:
    1. pls
    2. geojsonio
  3. Whilst the geojson_read method works in most installations (and has been extensively tested on macOS Monterey), it has thrown errors in some Windows installations. We were made aware of this only yesterday and do not have a resolution at the moment, mainly because the error does not occur on our primary platform, macOS Monterey. Should a resolution emerge, we could issue a fix via our GitHub repo. The attached HTML document shows the generated choropleth maps.

0.2 Library Setup

Please run the following chunk to ensure that all the necessary libraries are installed and loaded, should you wish to execute the chunks at your end. (Idea taken from this blogpost)

Please execute install.packages("pls") in the console if it’s the first time you’re running it. There could be some additional installations, depending on your operating system.
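The setup chunk itself is not reproduced in this text, so the following is only a minimal sketch of the install-if-missing idea. The package list is an assumption inferred from the functions used later (plot_ly, geojson_read, str_replace, count) plus pls, and find_missing_packages is a hypothetical helper name.

```r
# Assumed package list, inferred from the functions called in this document
required_packages <- c("dplyr", "stringr", "plotly", "geojsonio", "pls")

# Hypothetical helper: return the names of packages that are not installed
find_missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

missing_packages <- find_missing_packages(required_packages)
print(missing_packages)

# Uncomment to install whatever is missing, then load everything:
# if (length(missing_packages) > 0) install.packages(missing_packages)
# invisible(lapply(required_packages, library, character.only = TRUE))
```

Keeping the install call commented out means the check can be run without changing the local library.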

1. Loading Data

We downloaded the data from InsideAirbnb.com to the data folder. It is also available on our GitHub repo.

The data is in five parts (each city has all five elements):

  1. listing: The actual Airbnb listings. This is a set of 74 variables pertaining to a specific listing. The columns are defined here.
  2. review: List of reviews per row in the listing table.
  3. calendar: Price of a particular listing on a particular date, along with the minimum and maximum nights for hire.
  4. neighbourhoods: List of neighbourhoods screened in the city.
  5. map: GeoJSON file showing district boundaries.

The data may be accessed here or here.

1.1 Raw

We will read all the data into dataset variables.

#Singapore
listing.sin <- read.csv("./data/SIN_listings.csv")
reviews.sin <- read.csv("./data/SIN_reviews.csv")
calendar.sin <- read.csv("./data/SIN_calendar.csv")
neighbourhoods.sin <- read.csv("./data/SIN_neighbourhoods.csv")
map.sin <- geojson_read("./data/SIN_neighbourhoods.geojson")

#Taipei
listing.tpe <- read.csv("./data/TPE_listings.csv")
reviews.tpe <- read.csv("./data/TPE_reviews.csv")
calendar.tpe <- read.csv("./data/TPE_calendar.csv")
neighbourhoods.tpe <- read.csv("./data/TPE_neighbourhoods.csv")
map.tpe <- geojson_read("./data/TPE_neighbourhoods.geojson")

#Tokyo
listing.nrt <- read.csv("./data/NRT_listings.csv")
reviews.nrt <- read.csv("./data/NRT_reviews.csv")
calendar.nrt <- read.csv("./data/NRT_calendar.csv")
neighbourhoods.nrt <- read.csv("./data/NRT_neighbourhoods.csv")
map.nrt <- geojson_read("./data/NRT_neighbourhoods.geojson")

#Hong Kong
listing.hkg <- read.csv("./data/HKG_listings.csv")
reviews.hkg <- read.csv("./data/HKG_reviews.csv")
calendar.hkg <- read.csv("./data/HKG_calendar.csv")
neighbourhoods.hkg <- read.csv("./data/HKG_neighbourhoods.csv")
map.hkg <- geojson_read("./data/HKG_neighbourhoods.geojson")
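
Because the reads above follow a fixed CODE_part naming pattern, a quick pre-flight check can report which files are absent before any chunk fails. This is a sketch only: city_file is a hypothetical helper, and the ./data location is taken from the paths above.

```r
# Hypothetical helper: build the expected path for one city/part pair,
# e.g. ./data/SIN_listings.csv
city_file <- function(city_code, part, ext = "csv", data_dir = "./data") {
  file.path(data_dir, paste0(city_code, "_", part, ".", ext))
}

city_codes <- c("SIN", "TPE", "NRT", "HKG")
csv_parts  <- c("listings", "reviews", "calendar", "neighbourhoods")

# Four CSV files per city plus one GeoJSON map per city
expected <- c(
  as.vector(outer(city_codes, csv_parts, city_file)),
  city_file(city_codes, "neighbourhoods", ext = "geojson")
)

# Any path listed here needs to be downloaded before running the reads
missing_files <- expected[!file.exists(expected)]
print(missing_files)
```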

1.2 Initial Data Analysis

1.2.0 Number of Listings

We first look at the number of listings per city.

cities  <- c("Singapore", "Tokyo", "Taipei", "Hong Kong")
no_of_listings <- c(nrow(listing.sin), nrow(listing.nrt), nrow(listing.tpe), nrow(listing.hkg))
no_of_listings.fig <- plot_ly(
  x = cities,
  y = no_of_listings,
  type = "bar", 
  text = no_of_listings
)
no_of_listings.fig <- no_of_listings.fig %>% layout(title ="No of Listings Per City", yaxis = list(title="No of Listings"))
no_of_listings.fig

Clearly, Tokyo has the largest number of listings followed by Hong Kong, Taipei and then Singapore.

1.2.1 Listings by Neighbourhood - Choropleth Maps

Let us also consider choropleth maps of where the listings are in each city, by neighbourhood. Admittedly, the InsideAirbnb website has a map for each city (here’s one for Taipei), but it does not aggregate listings by district. At a later stage, this can be extended to show other variables or a time series.

generate_choropleth_by_city <- function (listing, map, city_name)
{
  listings_by_neighbourhood <- listing %>%
                                count(neighbourhood_cleansed) 
  # neighbourhoods_zero <- neighbourhoods %>%
  #                       filter(!neighbourhood %in% listings_by_neighbourhood$neighbourhood) %>%
  #                       mutate(n = 0) %>%
  #                       select(neighbourhood, n)
  # listings_by_neighbourhood <- union(listings_by_neighbourhood, neighbourhoods_zero)
  # print(listings_by_neighbourhood)
  g <- list (
    fitbounds = "locations",
    visible = FALSE
  )
  fig <- plot_ly()
  fig <- fig %>% add_trace(
    type="choropleth",
    geojson=map,
    locations=listings_by_neighbourhood$neighbourhood_cleansed,
    z=listings_by_neighbourhood$n,
    colorscale="Viridis",
    featureidkey="properties.neighbourhood"
  )
  
  fig <- fig %>% layout(
    geo = g
  )
  fig <- fig %>% colorbar(title = "No of listings")
  fig <- fig %>% layout(
    title = paste0("Listings by Neighbourhood - ", city_name)
  )
  fig
}
generate_choropleth_by_city(listing.sin, map.sin, "Singapore")
generate_choropleth_by_city(listing.nrt, map.nrt, "Tokyo")
generate_choropleth_by_city(listing.hkg, map.hkg, "Hong Kong")
generate_choropleth_by_city(listing.tpe, map.tpe, "Taipei")
bin_districts <- function(listing, bins)
{
  district_bins <- listing %>%
                    count(neighbourhood_cleansed) %>%
                    arrange(desc(n))%>%
                    mutate(nb_group = ntile(n,n=bins)) %>%
                    arrange(desc(nb_group))
  return(district_bins)
}

bin_districts(listing.sin, 4)
##     neighbourhood_cleansed   n nb_group
## 1                  Kallang 405        4
## 2                  Geylang 340        4
## 3            Downtown Core 317        4
## 4                   Outram 309        4
## 5                   Rochor 270        4
## 6                   Novena 235        4
## 7                    Bedok 186        4
## 8              Bukit Merah 178        4
## 9             River Valley 149        4
## 10              Queenstown 143        4
## 11         Singapore River 108        4
## 12                 Tanglin  90        3
## 13                 Orchard  80        3
## 14                Clementi  72        3
## 15             Jurong East  71        3
## 16             Jurong West  59        3
## 17           Marine Parade  57        3
## 18                  Newton  57        3
## 19             Bukit Timah  56        3
## 20               Woodlands  44        3
## 21                 Hougang  43        3
## 22               Toa Payoh  40        3
## 23                  Bishan  39        2
## 24               Serangoon  35        2
## 25               Pasir Ris  33        2
## 26             Bukit Batok  31        2
## 27                Tampines  30        2
## 28              Ang Mo Kio  26        2
## 29               Sembawang  26        2
## 30                Sengkang  19        2
## 31                  Yishun  19        2
## 32        Southern Islands  18        2
## 33                 Punggol  17        2
## 34           Choa Chu Kang  17        1
## 35                  Museum  15        1
## 36           Bukit Panjang  14        1
## 37 Central Water Catchment  12        1
## 38            Marina South   3        1
## 39 Western Water Catchment   3        1
## 40                  Mandai   2        1
## 41            Lim Chu Kang   1        1
## 42                 Pioneer   1        1
## 43            Sungei Kadut   1        1
## 44                    Tuas   1        1
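
To make the binning step easier to follow, here is a small sketch of how dplyr's ntile assigns groups. The six neighbourhoods and their counts are made up for illustration; only the ntile and arrange calls mirror bin_districts.

```r
library(dplyr)

# Made-up listing counts for six hypothetical neighbourhoods
counts <- data.frame(
  neighbourhood_cleansed = c("A", "B", "C", "D", "E", "F"),
  n = c(50, 40, 30, 20, 10, 5)
)

# ntile ranks the counts in ascending order and cuts them into equal-sized
# buckets, so the highest bucket number holds the busiest neighbourhoods
binned <- counts %>%
  mutate(nb_group = ntile(n, 3)) %>%
  arrange(desc(nb_group), desc(n))

print(binned)
```

Note that the buckets are equal in size rather than equal in width, which is why two neighbourhoods with the same count can land in different groups (as Punggol and Choa Chu Kang do above).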

1.2.2 Listings by Neighbourhood - Bar Charts

We could also view this as bar charts.

bar_charts_by_neighbourhood <- function (listing, city_name, neighbourhoods)
{
  listings_by_neighbourhood <- listing %>%
                                count(neighbourhood_cleansed) %>%
                                # rename(neighbourhood = neighbourhood_cleansed)
                                arrange(desc(n))
  # print(listings_by_neighbourhood)
  neighbourhoods_zero <- neighbourhoods %>%
                        filter(!neighbourhood %in% listings_by_neighbourhood$neighbourhood_cleansed) %>%
                        rename(neighbourhood_cleansed = neighbourhood) %>%
                        mutate(n = 0) %>%
                        select(neighbourhood_cleansed, n)
  print(neighbourhoods_zero)
  listings_by_neighbourhood <- union(listings_by_neighbourhood, neighbourhoods_zero)
  # print(listings_by_neighbourhood)
  fig<- plot_ly(y=listings_by_neighbourhood$neighbourhood_cleansed, x=listings_by_neighbourhood$n, type="bar", orientation="h") %>%
        layout(yaxis=list(categoryorder = "total ascending"), title=paste("Listings per neighbourhood in", city_name))
  fig
}

bar_charts_by_neighbourhood(listing.sin, "Singapore", neighbourhoods.sin)
##    neighbourhood_cleansed n
## 1             Marina East 0
## 2            Straits View 0
## 3                  Changi 0
## 4              Changi Bay 0
## 5              Paya Lebar 0
## 6   North-Eastern Islands 0
## 7                 Seletar 0
## 8                 Simpang 0
## 9                Boon Lay 0
## 10                 Tengah 0
## 11        Western Islands 0
bar_charts_by_neighbourhood(listing.nrt, "Tokyo", neighbourhoods.nrt)
##    neighbourhood_cleansed n
## 1          Aogashima Mura 0
## 2               Fussa Shi 0
## 3           Hachijo Machi 0
## 4       Higashiyamato Shi 0
## 5            Hinode Machi 0
## 6           Hinohara Mura 0
## 7               Inagi Shi 0
## 8              Kiyose Shi 0
## 9          Kozushima Mura 0
## 10        Mikurajima Mura 0
## 11            Miyake Mura 0
## 12           Mizuho Machi 0
## 13           Niijima Mura 0
## 14         Ogasawara Mura 0
## 15           Oshima Machi 0
## 16           Toshima Mura 0
bar_charts_by_neighbourhood(listing.hkg, "Hong Kong", neighbourhoods.hkg)
## [1] neighbourhood_cleansed n                     
## <0 rows> (or 0-length row.names)
bar_charts_by_neighbourhood(listing.tpe, "Taipei", neighbourhoods.tpe)
## [1] neighbourhood_cleansed n                     
## <0 rows> (or 0-length row.names)

The majority of listings in Taipei, Tokyo and Hong Kong are in tourist-heavy districts. The top district in Tokyo is Shinjuku, while that in Hong Kong is Yau Tsim Mong, Kowloon’s core urban area formed by the combination of Yau Ma Tei, Tsim Sha Tsui and Mong Kok. Taipei’s top district is Zhongzheng district (“中正區”), known for its historic sites and cultural performances.

In contrast to the other three cities, the top district in Singapore is Kallang, a high-end residential condo district that is not tourist-heavy but is close to the downtown’s many attractions. The tourist-heavy Geylang, Downtown Core and Outram districts follow Kallang.

Taipei and Hong Kong have listings in all districts, but 11 districts in Singapore (mostly military installations or the airport) and 16 districts in Tokyo do not have any listings.
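
The zero-listing districts can also be recovered with a different technique from the filter-and-union used in bar_charts_by_neighbourhood: counting over a factor whose levels are the full neighbourhood list. The sketch below uses base R and made-up values; it is an alternative illustration, not the method used in this document.

```r
# Made-up data: neighbourhoods that appear in listings, and the full official list
listing_hoods <- c("Kallang", "Geylang", "Kallang")
all_hoods     <- c("Kallang", "Geylang", "Changi")

# table() over a factor keeps every level, so unlisted neighbourhoods get a count of 0
counts <- as.data.frame(
  table(factor(listing_hoods, levels = all_hoods)),
  stringsAsFactors = FALSE
)
names(counts) <- c("neighbourhood_cleansed", "n")

# Neighbourhoods without any listing
empty_hoods <- counts$neighbourhood_cleansed[counts$n == 0]
print(counts)
print(empty_hoods)
```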

1.2.3 Amenities

There’s a column called amenities in the dataset that appears to list all the self-reported amenities of the listing as a single comma-separated list. Let’s explore this further.

For instance, here’s the longest list of amenities among Singapore listings.

listing_amenities.sin <- listing.sin %>%
                      mutate(amenities = str_replace(amenities, "\\[","")) %>%
                      mutate(amenities = str_replace(amenities, "\\]","")) %>%
                      mutate(amenities = str_replace_all(amenities, "\"","")) %>%
                      mutate(amenities = str_replace_all(amenities, ", " ,",")) %>%
                      mutate(amenities_list = as.list(strsplit(amenities, ","))) %>%
                      mutate(no_of_am = lengths(amenities_list)) %>%
                      mutate(Wifi = as.numeric(grepl('Wifi', amenities, fixed = TRUE))) %>%
                      mutate(Shampoo = as.numeric(grepl('Shampoo', amenities, fixed = TRUE))) %>%
                      mutate(Kitchen = as.numeric(grepl('Kitchen', amenities, fixed = TRUE))) 

# listing_amenities.sin %>% select(amenities, Wifi, Shampoo, Kitchen, Patio)

max_amenities.sin <- listing_amenities.sin %>%
                      select(amenities, no_of_am) %>%
                      group_by() %>%
                     slice(which.max(no_of_am))
amenities_list_string <- as.list(strsplit(as.character(max_amenities.sin["amenities"]), ","))
amenities_list_string
## [[1]]
##  [1] "Toaster"                         "Sound system"                   
##  [3] "Safe"                            "Indoor fireplace"               
##  [5] "Backyard"                        "Hangers"                        
##  [7] "Bed linens"                      "Hot water kettle"               
##  [9] "Freezer"                         "Coffee maker"                   
## [11] "Washer"                          "Cooking basics"                 
## [13] "Bathtub"                         "Hair dryer"                     
## [15] "Clothing storage"                "Oven"                           
## [17] "Outdoor furniture"               "Paid parking on premises"       
## [19] "High chair"                      "Children\\u2019s books and toys"
## [21] "Dedicated workspace"             "Crib"                           
## [23] "Dining table"                    "Free parking on premises"       
## [25] "Pool"                            "Cleaning products"              
## [27] "Wine glasses"                    "Cleaning before checkout"       
## [29] "Game console"                    "Long term stays allowed"        
## [31] "Drying rack for clothing"        "Outdoor dining area"            
## [33] "Private entrance"                "Elevator"                       
## [35] "Patio or balcony"                "Refrigerator"                   
## [37] "Dryer"                           "Microwave"                      
## [39] "Baby bath"                       "Gym"                            
## [41] "Wifi"                            "Children\\u2019s dinnerware"    
## [43] "Smoke alarm"                     "Board games"                    
## [45] "Luggage dropoff allowed"         "Shampoo"                        
## [47] "Breakfast"                       "Extra pillows and blankets"     
## [49] "Heating"                         "Conditioner"                    
## [51] "Cable TV"                        "Hot tub"                        
## [53] "Hot water"                       "Stove"                          
## [55] "Body soap"                       "BBQ grill"                      
## [57] "Iron"                            "Essentials"                     
## [59] "Babysitter recommendations"      "Pack \\u2019n play/Travel crib" 
## [61] "Kitchen"                         "Changing table"                 
## [63] "First aid kit"                   "Dishes and silverware"          
## [65] "Air conditioning"                "Shower gel"                     
## [67] "TV with standard cable"          "Fire extinguisher"
#"Shampoo,Kitchen,Long term stays allowed,Washer,Smart lock,Hair dryer,Dryer,Wifi,Hot water,TV,Air conditioning,Smoke alarm,Fire extinguisher"

Apropos of nothing, we will use the following amenities as dummy variables for price:

> “Shampoo,Kitchen,Long term stays allowed,Washer,Hair dryer,Wifi,Hot water,TV,Air conditioning”
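
One thing worth keeping in mind when reading these dummies: grepl with fixed = TRUE matches substrings, so a pattern such as 'Hot water' also fires on 'Hot water kettle', and 'TV' on 'Cable TV'. The sketch below contrasts substring matching with exact matching on a made-up amenities string; the exact-match variant is shown only for comparison and is not what this document uses.

```r
# Made-up amenities string in the bracketed, quoted format of the listings file
raw_amenities <- "[\"Wifi\", \"Hot water kettle\", \"Cable TV\", \"Kitchen\"]"

# Strip brackets and quotes, then split on commas
cleaned <- gsub("[", "", raw_amenities, fixed = TRUE)
cleaned <- gsub("]", "", cleaned, fixed = TRUE)
cleaned <- gsub("\"", "", cleaned, fixed = TRUE)
items <- trimws(strsplit(cleaned, ",", fixed = TRUE)[[1]])

# Substring match (as in the chunk above) versus exact match on the parsed items
substring_hot_water <- as.numeric(grepl("Hot water", cleaned, fixed = TRUE))
exact_hot_water     <- as.numeric("Hot water" %in% items)

print(items)
print(c(substring = substring_hot_water, exact = exact_hot_water))
```

Whether the broader substring match is desirable depends on the question: it treats any TV-related amenity as "has a TV", which may well be the intent here. The same overlap applies to the host verification patterns, where 'email' also matches 'work_email'.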

1.2.4 Host Verifications

Similarly, let us also further analyse the column host_verifications to see if we can generate dummy variables from there as well.

listing_host_verf.sin <- listing.sin %>%
                      mutate(host_verifications = str_replace(host_verifications, "\\[","")) %>%
                      mutate(host_verifications = str_replace(host_verifications, "\\]","")) %>%
                      mutate(host_verifications = str_replace_all(host_verifications, "\"","")) %>%
                      mutate(host_verifications = str_replace_all(host_verifications, ", " ,",")) %>%
                      mutate(host_verifications_list = as.list(strsplit(host_verifications, ","))) %>%
                      mutate(no_of_vf = lengths(host_verifications_list))

max_verf.sin <- listing_host_verf.sin %>%
                      select(host_verifications, no_of_vf) %>%
                      group_by() %>%
                     slice(which.max(no_of_vf))
host_verf_list_string <- as.list(strsplit(as.character(max_verf.sin["host_verifications"]), ","))
host_verf_list_string
## [[1]]
##  [1] "'email'"                 "'phone'"                
##  [3] "'facebook'"              "'google'"               
##  [5] "'reviews'"               "'jumio'"                
##  [7] "'offline_government_id'" "'selfie'"               
##  [9] "'government_id'"         "'identity_manual'"      
## [11] "'work_email'"

Let’s take this list to generate dummy variables.

> [‘email’, ‘phone’, ‘facebook’, ‘reviews’, ‘manual_offline’, ‘jumio’, ‘offline_government_id’, ‘government_id’, ‘work_email’]

Let’s generalise these two bits for all cities and create dummy variables for each one of them.

wrangle_amenities_hostvf <- function (listing)
{
  listing <- listing %>%
            mutate(amenities = str_replace(amenities, "\\[","")) %>%
            mutate(amenities = str_replace(amenities, "\\]","")) %>%
            mutate(amenities = str_replace_all(amenities, "\"","")) %>%
            mutate(amenities = str_replace_all(amenities, ", " ,",")) %>%
            mutate(amenities_list = as.list(strsplit(amenities, ","))) %>%
            mutate(no_of_am = lengths(amenities_list)) %>%
            mutate(Amenities_Wifi = as.numeric(grepl('Wifi', amenities, fixed = TRUE))) %>%
            mutate(Amenities_Shampoo = as.numeric(grepl('Shampoo', amenities, fixed = TRUE))) %>%
            mutate(Amenities_Kitchen = as.numeric(grepl('Kitchen', amenities, fixed = TRUE))) %>%
            mutate(Amenities_Long_Term = as.numeric(grepl('Long term stays', amenities, fixed = TRUE))) %>%
            mutate(Amenities_Washer = as.numeric(grepl('Washer', amenities, fixed = TRUE))) %>%
            mutate(Amenities_HairDryer = as.numeric(grepl('Hair dryer', amenities, fixed = TRUE))) %>%
            mutate(Amenities_HotWater = as.numeric(grepl('Hot water', amenities, fixed = TRUE))) %>%
            mutate(Amenities_TV = as.numeric(grepl('TV', amenities, fixed = TRUE))) %>%
            mutate(Amenities_AC = as.numeric(grepl('Air conditioning', amenities, fixed = TRUE))) %>%

            mutate(host_verifications = str_replace(host_verifications, "\\[","")) %>%
            mutate(host_verifications = str_replace(host_verifications, "\\]","")) %>%
            mutate(host_verifications = str_replace_all(host_verifications, "\"","")) %>%
            mutate(host_verifications = str_replace_all(host_verifications, ", " ,",")) %>%
            mutate(host_verifications_list = as.list(strsplit(host_verifications, ","))) %>%
            mutate(hv_email = as.numeric(grepl('email', host_verifications, fixed = TRUE))) %>%
            mutate(hv_phone = as.numeric(grepl('phone', host_verifications, fixed = TRUE))) %>%
            mutate(hv_facebook = as.numeric(grepl('facebook', host_verifications, fixed = TRUE))) %>%
            mutate(hv_reviews = as.numeric(grepl('reviews', host_verifications, fixed = TRUE))) %>%
            mutate(hv_manual_offline = as.numeric(grepl('manual_offline', host_verifications, fixed = TRUE))) %>%
            mutate(hv_manual_jumio = as.numeric(grepl('jumio', host_verifications, fixed = TRUE))) %>%
            mutate(hv_manual_off_gov = as.numeric(grepl('offline_government_id', host_verifications, fixed = TRUE))) %>%
            mutate(hv_manual_gov = as.numeric(grepl('government_id', host_verifications, fixed = TRUE))) %>%
            mutate(hv_manual_work_email = as.numeric(grepl('work_email', host_verifications, fixed = TRUE))) %>%
            mutate(no_of_vf = lengths(host_verifications_list))
}

listing.sin <- wrangle_amenities_hostvf(listing.sin)
listing.nrt <- wrangle_amenities_hostvf(listing.nrt)
listing.tpe <- wrangle_amenities_hostvf(listing.tpe)
listing.hkg <- wrangle_amenities_hostvf(listing.hkg)

1.3 Data Wrangling

1.3.1 Wrangling Listings

This function wrangles the Airbnb data into an analysable form. Because we will be doing the same for multiple cities, we wrap the steps in a function. It builds on the code shared in the lecture for Module 2. The obvious additions are the id column, neighbourhoods and dummy variables for amenities and host verifications.

wrangle_airbnb_dataset <- function (raw_listing_full)
{
  listing.raw <- raw_listing_full  %>% 
                select(id, price,number_of_reviews,beds,bathrooms,accommodates,reviews_per_month, property_type, room_type, review_scores_rating, neighbourhood_cleansed, host_response_time, host_response_rate, host_acceptance_rate, host_is_superhost, latitude, longitude, amenities, last_review, no_of_am, Amenities_Wifi, Amenities_Shampoo, Amenities_Kitchen, Amenities_Long_Term, Amenities_Washer, Amenities_HairDryer, Amenities_HotWater, Amenities_TV,Amenities_AC, host_verifications, hv_email,hv_phone, hv_facebook, hv_reviews, hv_manual_offline, hv_manual_jumio,hv_manual_off_gov, hv_manual_gov, hv_manual_work_email, no_of_vf) %>% 
                rename(Reviews = number_of_reviews) %>% 
                rename(Beds = beds) %>% 
                rename(Baths = bathrooms) %>% 
                rename(Capacity = accommodates) %>% 
                rename(Monthly_Reviews = reviews_per_month) %>% 
                rename(Property_Type = property_type) %>% 
                rename(Room_Type = room_type) %>% 
                rename(Price = price) %>% 
                rename(Rating = review_scores_rating) %>%
                # rename(Neighbourhood = neighbourhood_cleansed) %>%
                rename(host_Superhost = host_is_superhost)


  listing.raw <-  listing.raw %>% 
                mutate(Price = str_replace(Price, "[$]", "")) %>% 
                # str_replace_all so that prices with more than one thousands separator parse correctly
                mutate(Price = str_replace_all(Price, "[,]", "")) %>% 
                mutate(Price = as.numeric(Price)) %>% 
                
                # mutate(hood_factor = as.factor(Neighbourhood)) %>%
                
                mutate(host_response_rate = str_replace(host_response_rate, "[%]", "")) %>%
                mutate(host_response_rate = as.numeric(host_response_rate)/100) %>%
                mutate(host_acceptance_rate = str_replace(host_acceptance_rate, "[%]", "")) %>%
                mutate(host_acceptance_rate = as.numeric(host_acceptance_rate)/100) %>%
                mutate(host_Superhost = ifelse(host_Superhost =="f", 0, 1)) %>%
    
                # Response-time dummies come from host_response_time (categorical), not host_response_rate (numeric)
                mutate(host_response_time = factor(host_response_time, levels = c("within a few hours", "within a day", "a few days or more"))) %>%
                mutate(host_response_hours = ifelse(host_response_time == "within a few hours", 1, 0)) %>%
                mutate(host_response_day = ifelse(host_response_time == "within a day", 1, 0)) %>%
                mutate(host_response_few_days = ifelse(host_response_time == "a few days or more", 1, 0)) %>%
                
                mutate(last_review = as.Date(last_review)) %>%
                mutate(Days_since_last_review = as.numeric(difftime(as.Date("2021-12-31"), last_review, units="days"))) %>%
    
                mutate(Room_Type = factor(Room_Type, levels = c("Shared room", "Private room", "Entire home/apt"))) %>% 
                mutate(Capacity_Sqr = Capacity * Capacity) %>% 
                mutate(Beds_Sqr = Beds * Beds) %>% 
                mutate(Baths_Sqr = Baths * Baths) %>% 
                mutate(ln_Price = log(1+Price)) %>% 
                mutate(ln_Beds = log(1+Beds)) %>%
                mutate(ln_Baths = log(1+Baths)) %>% 
                mutate(ln_Capacity = log(1+Capacity)) %>% 
                mutate(ln_Rating = log(1+Rating)) %>% 
                mutate(Shared_ind = ifelse(Room_Type == "Shared room",1,0)) %>% 
                mutate(House_ind = ifelse(Room_Type == "Entire home/apt",1,0)) %>% 
                mutate(Private_ind = ifelse(Room_Type == "Private room",1,0)) %>% 
                mutate(Capacity_x_Shared_ind = Shared_ind * Capacity) %>% 
                mutate(H_Cap = House_ind * Capacity) %>% 
                mutate(P_Cap = Private_ind * Capacity) %>% 
                mutate(ln_Capacity_x_Shared_ind = Shared_ind * ln_Capacity) %>% 
                mutate(ln_Capacity_x_House_ind = House_ind * ln_Capacity) %>% 
                mutate(ln_Capacity_x_Private_ind = Private_ind * ln_Capacity) %>%
                
                filter(!is.na(Price))

  return(listing.raw)
}

list.sin <- wrangle_airbnb_dataset(listing.sin)
list.nrt <- wrangle_airbnb_dataset(listing.nrt)
list.tpe <- wrangle_airbnb_dataset(listing.tpe)
list.hkg <- wrangle_airbnb_dataset(listing.hkg)
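
The string-to-number conversions inside the function are easiest to check on a few made-up values. The helpers below are hypothetical and written in base R; they only illustrate the cleaning of currency and percentage strings, not the full wrangling function.

```r
# Hypothetical helper: "$1,250.00" -> 1250
parse_price <- function(x) {
  as.numeric(gsub("[$,]", "", x))
}

# Hypothetical helper: "95%" -> 0.95; non-numeric entries such as "N/A" become NA
parse_rate <- function(x) {
  suppressWarnings(as.numeric(sub("%", "", x, fixed = TRUE))) / 100
}

prices <- parse_price(c("$1,250.00", "$80.00"))
rates  <- parse_rate(c("95%", "N/A"))

print(prices)
print(rates)
```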

1.3.2 Wrangling Reviews

There’s value in understanding how many reviews a property has received in the last 12 months as a measure of how active a property is. The notion is that modelling price for active listings will be more accurate than modelling price for all listings.

The approach taken in this paper was to look at listings active in the past 12 months. However, given the restrictions imposed during the pandemic, we felt it would be better to look at the period between 1 Jan 2019 and 31 Dec 2021, which includes one pre-pandemic year in addition to the two pandemic years, 2020 and 2021.

We check this by wrangling the review dataset.

count_reviews <- function(listings, reviews, from_date, to_date)
{
  reviews_grouped <- reviews %>%
                  mutate(date = as.Date(date)) %>%
                  filter(between(date, as.Date(from_date), as.Date(to_date))) %>%
                  group_by(listing_id) %>%
                  summarise(reviews_since_2019 = n()) %>%
                  mutate(bookings_since_2019 = reviews_since_2019*2) %>%
                  rename(id = listing_id)
  listings <- left_join(listings, reviews_grouped, by="id")
  return(listings)
}
start_date = "2019-1-1"
end_date = "2021-12-31"
list.sin <-count_reviews(list.sin, reviews.sin,start_date, end_date)
list.hkg <- count_reviews(list.hkg, reviews.hkg,start_date, end_date)
list.tpe <- count_reviews(list.tpe, reviews.tpe,start_date, end_date)
list.nrt <- count_reviews(list.nrt, reviews.nrt,start_date, end_date)

list_after_2019.sin <- list.sin %>% filter(!is.na(reviews_since_2019)) 
list_after_2019.tpe <- list.tpe %>% filter(!is.na(reviews_since_2019))
list_after_2019.nrt <- list.nrt %>% filter(!is.na(reviews_since_2019))
list_after_2019.hkg <- list.hkg %>% filter(!is.na(reviews_since_2019))
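
A toy run of the same idea may help: count the reviews that fall inside the window, join the counts back to the listings, and keep only listings with at least one review. All data below is made up, and plain comparison operators stand in for between.

```r
library(dplyr)

# Made-up listings and reviews
listings <- data.frame(id = c(1, 2, 3), Price = c(100, 80, 120))
reviews <- data.frame(
  listing_id = c(1, 1, 2, 2),
  date = c("2018-06-01", "2019-03-15", "2020-07-04", "2021-12-31")
)

from_date <- as.Date("2019-01-01")
to_date   <- as.Date("2021-12-31")

# Count reviews per listing inside the window (both end dates inclusive)
in_window <- reviews %>%
  mutate(date = as.Date(date)) %>%
  filter(date >= from_date, date <= to_date) %>%
  count(listing_id, name = "reviews_since_2019") %>%
  rename(id = listing_id)

# Listings without any review in the window get NA and are dropped as inactive
joined <- left_join(listings, in_window, by = "id")
active <- joined %>% filter(!is.na(reviews_since_2019))

print(joined)
print(active)
```

Here listing 1 keeps only its 2019 review, listing 2 keeps both, and listing 3 drops out because it has no reviews at all.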

Let’s look at the listings with reviews since 2019 in map form.

generate_choropleth_by_city(list_after_2019.sin, map.sin, "Singapore")
generate_choropleth_by_city(list_after_2019.nrt, map.nrt, "Tokyo")
generate_choropleth_by_city(list_after_2019.hkg, map.hkg, "Hong Kong")
generate_choropleth_by_city(list_after_2019.tpe, map.tpe, "Taipei")
bar_charts_by_neighbourhood(list_after_2019.sin, "Singapore", neighbourhoods.sin)
##     neighbourhood_cleansed n
## 1              Marina East 0
## 2             Straits View 0
## 3                   Changi 0
## 4               Changi Bay 0
## 5               Paya Lebar 0
## 6    North-Eastern Islands 0
## 7                  Seletar 0
## 8                  Simpang 0
## 9             Sungei Kadut 0
## 10                Boon Lay 0
## 11                 Pioneer 0
## 12                  Tengah 0
## 13                    Tuas 0
## 14         Western Islands 0
## 15 Western Water Catchment 0
bar_charts_by_neighbourhood(list_after_2019.nrt, "Tokyo", neighbourhoods.nrt)
##    neighbourhood_cleansed n
## 1          Aogashima Mura 0
## 2               Fussa Shi 0
## 3           Hachijo Machi 0
## 4       Higashiyamato Shi 0
## 5            Hinode Machi 0
## 6           Hinohara Mura 0
## 7               Inagi Shi 0
## 8              Kiyose Shi 0
## 9          Kozushima Mura 0
## 10        Mikurajima Mura 0
## 11            Miyake Mura 0
## 12           Mizuho Machi 0
## 13           Niijima Mura 0
## 14         Ogasawara Mura 0
## 15           Oshima Machi 0
## 16           Toshima Mura 0
bar_charts_by_neighbourhood(list_after_2019.hkg, "Hong Kong", neighbourhoods.hkg)
## [1] neighbourhood_cleansed n                     
## <0 rows> (or 0-length row.names)
bar_charts_by_neighbourhood(list_after_2019.tpe, "Taipei", neighbourhoods.tpe)
## [1] neighbourhood_cleansed n                     
## <0 rows> (or 0-length row.names)
add_earnings <- function(listing)
{
  return (listing %>% mutate(earnings_since_2019 = bookings_since_2019 * 3 * Price))
}
list_after_2019.sin <- add_earnings(list_after_2019.sin) 
list_after_2019.tpe <- add_earnings(list_after_2019.tpe) 
list_after_2019.nrt <- add_earnings(list_after_2019.nrt) 
list_after_2019.hkg <- add_earnings(list_after_2019.hkg) 
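
The multipliers in count_reviews and add_earnings are not explained in the text. One plausible reading, stated here as an assumption, is that every review stands for two bookings (a 50% review rate) and every booking for three nights. Under that reading the estimate works out as follows; estimate_earnings and its parameter names are hypothetical.

```r
# Hypothetical helper restating the arithmetic of count_reviews and add_earnings.
# The default multipliers (2 and 3) are copied from the code above; their
# interpretation as review rate and length of stay is an assumption.
estimate_earnings <- function(reviews, price,
                              bookings_per_review = 2,
                              nights_per_booking = 3) {
  bookings <- reviews * bookings_per_review
  bookings * nights_per_booking * price
}

# 10 reviews at a nightly price of 100: 10 * 2 * 3 * 100
example_earnings <- estimate_earnings(reviews = 10, price = 100)
print(example_earnings)
```

Because the estimate uses the current nightly price for the whole 2019 to 2021 window, it is best read as a rough ranking measure rather than as actual revenue.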

Let’s bin the neighbourhoods into five groups by number of listings: extremely popular, popular, moderate, not so popular, and sparse.

district_bins.sin <- bin_districts(list_after_2019.sin, bins=5)
district_bins.sin
##     neighbourhood_cleansed   n nb_group
## 1                  Geylang 224        5
## 2                  Kallang 221        5
## 3                   Outram 201        5
## 4                   Rochor 136        5
## 5            Downtown Core 134        5
## 6                    Bedok 101        5
## 7              Bukit Merah  94        5
## 8                   Novena  73        5
## 9             River Valley  67        4
## 10              Queenstown  52        4
## 11                 Tanglin  38        4
## 12         Singapore River  36        4
## 13             Jurong West  32        4
## 14           Marine Parade  27        4
## 15             Jurong East  24        4
## 16                 Orchard  24        4
## 17                  Newton  22        3
## 18               Woodlands  22        3
## 19             Bukit Timah  21        3
## 20               Serangoon  21        3
## 21                Clementi  17        3
## 22                  Bishan  16        3
## 23                Tampines  16        3
## 24                 Hougang  15        3
## 25               Toa Payoh  13        2
## 26                  Museum   9        2
## 27                 Punggol   9        2
## 28              Ang Mo Kio   8        2
## 29                  Yishun   8        2
## 30           Choa Chu Kang   7        2
## 31 Central Water Catchment   6        2
## 32               Pasir Ris   6        2
## 33             Bukit Batok   6        1
## 34           Bukit Panjang   6        1
## 35                Sengkang   5        1
## 36        Southern Islands   3        1
## 37            Marina South   2        1
## 38            Lim Chu Kang   1        1
## 39                  Mandai   1        1
## 40               Sembawang   1        1
district_bins.nrt <- bin_districts(list_after_2019.nrt, bins=5)
district_bins.nrt
##    neighbourhood_cleansed    n nb_group
## 1             Shinjuku Ku 1653        5
## 2                Taito Ku 1147        5
## 3               Sumida Ku  810        5
## 4              Toshima Ku  707        5
## 5              Shibuya Ku  509        5
## 6                  Ota Ku  357        5
## 7               Minato Ku  342        5
## 8                 Chuo Ku  330        5
## 9               Nakano Ku  255        5
## 10            Setagaya Ku  251        4
## 11          Katsushika Ku  216        4
## 12                Kita Ku  181        4
## 13            Suginami Ku  177        4
## 14             Arakawa Ku  153        4
## 15           Shinagawa Ku  137        4
## 16                Koto Ku  136        4
## 17             Edogawa Ku  135        4
## 18            Itabashi Ku  110        4
## 19             Chiyoda Ku  110        3
## 20              Bunkyo Ku  106        3
## 21              Adachi Ku   73        3
## 22              Meguro Ku   47        3
## 23              Nerima Ku   44        3
## 24           Hachioji Shi   18        3
## 25               Hino Shi   15        3
## 26            Machida Shi   14        3
## 27              Chofu Shi   12        3
## 28              Fuchu Shi   11        2
## 29          Kokubunji Shi   10        2
## 30             Mitaka Shi    9        2
## 31            Akiruno Shi    7        2
## 32    Higashimurayama Shi    7        2
## 33          Kunitachi Shi    7        2
## 34          Musashino Shi    7        2
## 35          Tachikawa Shi    7        2
## 36               Tama Shi    7        2
## 37         Nishitokyo Shi    6        1
## 38            Kodaira Shi    5        1
## 39                Ome Shi    5        1
## 40              Komae Shi    4        1
## 41             Hamura Shi    3        1
## 42    Musashimurayama Shi    3        1
## 43          Okutama Machi    3        1
## 44           Akishima Shi    2        1
## 45      Higashikurume Shi    2        1
## 46            Koganei Shi    2        1
district_bins.hkg <- bin_districts(list_after_2019.hkg, bins=5)
district_bins.hkg
##    neighbourhood_cleansed    n nb_group
## 1           Yau Tsim Mong 1208        5
## 2                Wan Chai  311        5
## 3       Central & Western  279        5
## 4                 Islands  211        4
## 5            Kowloon City   96        4
## 6                 Eastern   70        4
## 7               Yuen Long   66        3
## 8                   North   65        3
## 9                Sai Kung   47        3
## 10           Sham Shui Po   34        3
## 11                Sha Tin   24        2
## 12               Southern   24        2
## 13                 Tai Po   19        2
## 14               Tuen Mun   12        2
## 15              Kwun Tong    9        1
## 16              Tsuen Wan    4        1
## 17             Kwai Tsing    3        1
## 18           Wong Tai Sin    3        1
district_bins.tpe <- bin_districts(list_after_2019.tpe, bins=5)
district_bins.tpe
##    neighbourhood_cleansed   n nb_group
## 1                  萬華區 536        5
## 2                  中正區 475        5
## 3                  大安區 454        4
## 4                  中山區 407        4
## 5                  信義區 283        3
## 6                  大同區 153        3
## 7                  松山區 142        2
## 8                  士林區 119        2
## 9                  文山區  64        2
## 10                 北投區  54        1
## 11                 內湖區  49        1
## 12                 南港區  24        1
list_after_2019.sin <- left_join(list_after_2019.sin, district_bins.sin %>% select(neighbourhood_cleansed, nb_group), by="neighbourhood_cleansed")
list_after_2019.nrt <- left_join(list_after_2019.nrt, district_bins.nrt %>% select(neighbourhood_cleansed, nb_group), by="neighbourhood_cleansed")
list_after_2019.hkg <- left_join(list_after_2019.hkg, district_bins.hkg %>% select(neighbourhood_cleansed, nb_group), by="neighbourhood_cleansed")
list_after_2019.tpe <- left_join(list_after_2019.tpe, district_bins.tpe %>% select(neighbourhood_cleansed, nb_group), by="neighbourhood_cleansed")
# list_after_2019.sin
list_after_2019.sin <- dummy_cols(list_after_2019.sin, select_columns = "nb_group", remove_selected_columns = TRUE)
list_after_2019.nrt <- dummy_cols(list_after_2019.nrt, select_columns = "nb_group", remove_selected_columns = TRUE)
list_after_2019.tpe <- dummy_cols(list_after_2019.tpe, select_columns = "nb_group", remove_selected_columns = TRUE)
list_after_2019.hkg <- dummy_cols(list_after_2019.hkg, select_columns = "nb_group", remove_selected_columns = TRUE)

list_after_2019.sin_remove <- dummy_cols(list_after_2019.sin, select_columns = c("Property_Type","Room_Type"), remove_selected_columns = TRUE)
list_after_2019.nrt_remove <- dummy_cols(list_after_2019.nrt, select_columns = c("Property_Type","Room_Type"), remove_selected_columns = TRUE)
list_after_2019.tpe_remove <- dummy_cols(list_after_2019.tpe, select_columns = c("Property_Type","Room_Type"), remove_selected_columns = TRUE)
list_after_2019.hkg_remove <- dummy_cols(list_after_2019.hkg, select_columns = c("Property_Type","Room_Type"), remove_selected_columns = TRUE)

# list_after_2019.sin
# list_after_2019.nrt
# list_after_2019.hkg
# list_after_2019.tpe

This reduces the number of listings, and hopefully, quite a few outliers.

cities  <- c("Singapore", "Tokyo", "Taipei", "Hong Kong")
no_of_listings <- c(nrow(listing.sin), nrow(listing.nrt), nrow(listing.tpe), nrow(listing.hkg))
no_of_listings_after_2019 <- c(nrow(list_after_2019.sin), nrow(list_after_2019.nrt), nrow(list_after_2019.tpe), nrow(list_after_2019.hkg))
data <- data.frame(cities, no_of_listings, no_of_listings_after_2019)
no_of_listings.fig <- plot_ly(data, 
  x = cities,
  y = ~no_of_listings,
  type = "bar", 
  text = no_of_listings,
  name = "No of Listings (All Years)"
)
no_of_listings.fig <- no_of_listings.fig %>% add_trace(y = ~no_of_listings_after_2019, text= no_of_listings_after_2019, name = "No of Active Listings")
no_of_listings.fig <- no_of_listings.fig %>% layout(title ="No of Listings Per City", yaxis = list(title="No of Listings"))
no_of_listings.fig

This roughly halves the number of listings being considered in Hong Kong and Singapore, but not in Taipei and Tokyo.

2. Variable Selection: Exploratory Data Analysis

We now attempt to check on variables for each city.

2.1 Data Exploration

2.1.1 Singapore

data_exploration <- function (listing)
{
  plot_str(listing, type="r")
  introduce(listing)
  plot_intro(listing)
  plot_missing(listing)
  plot_bar(listing)
  # Use the function argument rather than the global list.sin, so each city is explored on its own data
  pca_df <- na.omit(listing[, c("Price", "Room_Type", "Reviews", "Beds", "Capacity", "Monthly_Reviews", "host_Superhost", "Rating")])#,"Days_since_last_review", "host_response_rate", "host_response_hours", "host_acceptance_rate","host_response_day", "host_response_few_days")])
  plot_qq(pca_df)
  plot_prcomp(pca_df, variance_cap = 0.9, nrow = 2L, ncol=2L)
}
data_exploration(list.sin)

## 4 columns ignored with more than 50 categories.
## Property_Type: 51 categories
## amenities: 2815 categories
## last_review: 962 categories
## host_verifications: 140 categories

2.1.2 Taipei

data_exploration(list.tpe)

## 4 columns ignored with more than 50 categories.
## Property_Type: 58 categories
## amenities: 3376 categories
## last_review: 1050 categories
## host_verifications: 158 categories

2.1.3 Hong Kong

data_exploration(list.hkg)

## 4 columns ignored with more than 50 categories.
## Property_Type: 69 categories
## amenities: 3846 categories
## last_review: 1125 categories
## host_verifications: 150 categories

2.1.4 Tokyo

data_exploration(list.nrt)

## 4 columns ignored with more than 50 categories.
## Property_Type: 64 categories
## amenities: 7467 categories
## last_review: 987 categories
## host_verifications: 192 categories

2.2 Boxplots

We now check for outliers in several variables, restricting the data to listings that have seen at least one booking since 1 Jan 2019. We start with the Singapore data.
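As a rough illustration of what a boxplot flags as an outlier, the sketch below applies the usual 1.5 × IQR whisker rule through base R's `boxplot.stats()`. The price vector is hypothetical and is not drawn from the Airbnb data; it only shows how a single extreme value ends up beyond the whiskers.

```r
# Hypothetical nightly prices; not taken from the Airbnb data.
prices <- c(95, 120, 150, 160, 180, 199, 210, 288, 344, 2500)

# boxplot.stats() returns the points beyond the whiskers in $out,
# using the default coefficient of 1.5 times the hinge spread.
stats <- boxplot.stats(prices, coef = 1.5)
outlier_prices <- stats$out

print(stats$stats)     # lower whisker, lower hinge, median, upper hinge, upper whisker
print(outlier_prices)  # values a boxplot would draw as individual points
```

The same call could be pointed at a real column such as `list_after_2019.sin$Price` to list the flagged prices instead of only seeing them on the plot.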

generate_price_boxplot <- function (listing.clean, city, comparison_col = "")
{
  # png(file = "./graphs/boxplot.png")
  if (comparison_col == "")
  {
    boxplot(listing.clean$Price, data = listing.clean, ylab="Price", main=paste("Boxplot: Price for", city))
  }
  else
    boxplot(listing.clean$Price ~ listing.clean[[comparison_col]], data = listing.clean, ylab="Price", xlab=comparison_col, main=paste("Boxplot: Price vs", comparison_col, "for", city))
  # dev.off()
}

generate_price_boxplot(list_after_2019.sin, "Singapore") #, sin_listing.clean$)

generate_price_boxplot(list_after_2019.sin, "Singapore", "Room_Type") #, sin_listing.clean$)

generate_price_boxplot(list_after_2019.sin, "Singapore", "Property_Type") #, sin_listing.clean$)

generate_price_boxplot(list_after_2019.sin, "Singapore", "Capacity") #, sin_listing.clean$)

generate_price_boxplot(list_after_2019.sin, "Singapore", "Beds") #, sin_listing.clean$)

generate_price_boxplot(list_after_2019.sin, "Singapore", "neighbourhood_cleansed") #, sin_listing.clean$)

generate_price_boxplot(list_after_2019.sin, "Singapore", "Reviews") #, sin_listing.clean$)

It seems that a single boat in the Bukit Merah area (possibly next to the marina at Keppel Bay) has a very high price, at $2,500 per night. Let’s look at it more closely.

# head(list_after_2019.sin %>% arrange(desc(Price)))
# head(list_after_2019.sin %>% arrange(desc(reviews_since_2019)))
head(list_after_2019.sin %>% filter(Property_Type == "Boat") %>% arrange (desc(reviews_since_2019)))
##         id Price Reviews Beds Baths Capacity Monthly_Reviews Property_Type
## 1 31527262   344     217    1    NA        2            6.24          Boat
## 2 37907711   199     177    3    NA        4            6.23          Boat
## 3 20247516  2500      55    4    NA        5            1.05          Boat
## 4 50433019   288       7    2    NA        5            2.50          Boat
##         Room_Type Rating neighbourhood_cleansed host_response_time
## 1 Entire home/apt   4.94       Southern Islands     within an hour
## 2 Entire home/apt   4.49                Punggol     within an hour
## 3 Entire home/apt   4.74            Bukit Merah       within a day
## 4 Entire home/apt   5.00                Punggol within a few hours
##   host_response_rate host_acceptance_rate host_Superhost latitude longitude
## 1               <NA>                 1.00              1  1.24535  103.8387
## 2               <NA>                 0.96              0  1.41585  103.9001
## 3               <NA>                 0.98              0  1.26520  103.8190
## 4               <NA>                 0.92              0  1.41480  103.8986
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           amenities
## 1                                                                                                                                                                                  Toaster,Sound system,Hangers,Bed linens,Hot water kettle,Coffee maker,Carbon monoxide alarm,Hair dryer,TV,Outdoor furniture,Dining table,Security cameras on property,Outdoor dining area,Private entrance,Refrigerator,Microwave,Waterfront,Wifi,Smoke alarm,Shampoo,Portable fans,Paid parking off premises,Hot water,Mini fridge,Essentials,First aid kit,Dishes and silverware,Air conditioning,Shower gel,Fire extinguisher
## 2                                                                                                                                                                                                                 Sound system,Hangers,Bed linens,Coffee maker,Hair dryer,TV,Paid parking on premises,Lockbox,Room-darkening shades,Security cameras on property,Long term stays allowed,Patio or balcony,Pour-over coffee,Microwave,Waterfront,Wifi,Smoke alarm,Shampoo,Extra pillows and blankets,Hot water,Mini fridge,Essentials,Kitchen,EV charger,First aid kit,Air conditioning,Shower gel,Fire extinguisher
## 3                                                                                                                                                                                                                                                                                                                                                                                                                             Shampoo,Essentials,Long term stays allowed,Carbon monoxide alarm,Hair dryer,Host greets you,Hot water,Pool,TV,Paid parking on premises,Air conditioning,Smoke alarm,Fire extinguisher
## 4 Toaster,Bidet,Sound system,Rice maker,Hangers,Bed linens,Hot water kettle,Cooking basics,Freezer,Washer,Bathtub,Hair dryer,TV,Outdoor furniture,Dining table,Dedicated workspace,Free parking on premises,Lockbox,Cleaning products,Clothing storage: closet,dresser,and wardrobe,Long term stays allowed,Outdoor dining area,Private entrance,Pocket wifi,Refrigerator,Waterfront,Wifi,Smoke alarm,Induction stove,Dishwasher,Extra pillows and blankets,Portable fans,Hot water,Mini fridge,Iron,Essentials,Boat slip,Kitchen,EV charger,First aid kit,Air conditioning,Dishes and silverware,Fire extinguisher
##   last_review no_of_am Amenities_Wifi Amenities_Shampoo Amenities_Kitchen
## 1  2021-09-15       30              1                 1                 0
## 2  2021-12-12       28              1                 1                 1
## 3  2020-07-28       13              0                 1                 0
## 4  2021-12-07       45              1                 0                 1
##   Amenities_Long_Term Amenities_Washer Amenities_HairDryer Amenities_HotWater
## 1                   0                0                   1                  1
## 2                   1                0                   1                  1
## 3                   1                0                   1                  1
## 4                   1                1                   1                  1
##   Amenities_TV Amenities_AC
## 1            1            1
## 2            1            1
## 3            1            1
## 4            1            1
##                                                                           host_verifications
## 1 'email','phone','jumio','offline_government_id','selfie','government_id','identity_manual'
## 2 'email','phone','jumio','offline_government_id','selfie','government_id','identity_manual'
## 3               'email','phone','reviews','jumio','selfie','government_id','identity_manual'
## 4                                                                                    'phone'
##   hv_email hv_phone hv_facebook hv_reviews hv_manual_offline hv_manual_jumio
## 1        1        1           0          0                 0               1
## 2        1        1           0          0                 0               1
## 3        1        1           0          1                 0               1
## 4        0        1           0          0                 0               0
##   hv_manual_off_gov hv_manual_gov hv_manual_work_email no_of_vf
## 1                 1             1                    0        7
## 2                 1             1                    0        7
## 3                 0             1                    0        7
## 4                 0             0                    0        1
##   host_response_hours 1 0 host_response_day host_response_few_days
## 1                  NA 1 0                NA                     NA
## 2                  NA 1 0                NA                     NA
## 3                  NA 1 0                NA                     NA
## 4                  NA 1 0                NA                     NA
##   Days_since_last_review Capacity_Sqr Beds_Sqr Baths_Sqr ln_Price   ln_Beds
## 1                    107            4        1        NA 5.843544 0.6931472
## 2                     19           16        9        NA 5.298317 1.3862944
## 3                    521           25       16        NA 7.824446 1.6094379
## 4                     24           25        4        NA 5.666427 1.0986123
##   ln_Baths ln_Capacity ln_Rating Shared_ind House_ind Private_ind
## 1       NA    1.098612  1.781709          0         1           0
## 2       NA    1.609438  1.702928          0         1           0
## 3       NA    1.791759  1.747459          0         1           0
## 4       NA    1.791759  1.791759          0         1           0
##   Capacity_x_Shared_ind H_Cap P_Cap ln_Capacity_x_Shared_ind
## 1                     0     2     0                        0
## 2                     0     4     0                        0
## 3                     0     5     0                        0
## 4                     0     5     0                        0
##   ln_Capacity_x_House_ind ln_Capacity_x_Private_ind reviews_since_2019
## 1                1.098612                         0                217
## 2                1.609438                         0                177
## 3                1.791759                         0                 26
## 4                1.791759                         0                  7
##   bookings_since_2019 earnings_since_2019 nb_group_1 nb_group_2 nb_group_3
## 1                 434              447888          1          0          0
## 2                 354              211338          0          1          0
## 3                  52              390000          0          0          0
## 4                  14               12096          0          1          0
##   nb_group_4 nb_group_5
## 1          0          0
## 2          0          0
## 3          0          1
## 4          0          0
head(list_after_2019.sin %>% group_by(id, Property_Type, bookings_since_2019) %>% summarise(percent_of_total = bookings_since_2019*100/sum(list_after_2019.sin$bookings_since_2019)) %>% filter(Property_Type == "Boat") %>% arrange (desc(bookings_since_2019)))
## `summarise()` has grouped output by 'id', 'Property_Type'. You can override
## using the `.groups` argument.
## # A tibble: 4 × 4
## # Groups:   id, Property_Type [4]
##         id Property_Type bookings_since_2019 percent_of_total
##      <int> <chr>                       <dbl>            <dbl>
## 1 31527262 Boat                          434           0.967 
## 2 37907711 Boat                          354           0.788 
## 3 20247516 Boat                           52           0.116 
## 4 50433019 Boat                           14           0.0312

There are four boats listed on Airbnb Singapore. Together, they form roughly 2% of all bookings since 2019.

2.3 Correlation Matrices

list_of_vars = c("earnings_since_2019","Rating", "Reviews", "Beds", "Capacity", "host_acceptance_rate", "host_Superhost","Amenities_Wifi","Amenities_Shampoo","Amenities_Kitchen","Amenities_Long_Term","Amenities_Washer","Amenities_HairDryer", "Amenities_HotWater", "Amenities_TV", "Amenities_AC", "hv_email",  "hv_reviews", "Shared_ind", "House_ind", "Private_ind")#, "reviews_since_2019","bookings_since_2019") #, "hood_factor")
# list_after_2019.sin %>% select_(.dots = c(list_of_vars), "Price")
# select_() is deprecated; select(all_of(...)) picks the same columns from a character vector
vars_list.sin = list_after_2019.sin %>% select(all_of(c(list_of_vars, "Price"))) %>% na.omit()
vars_list.tpe = list_after_2019.tpe %>% select(all_of(c(list_of_vars, "Price"))) %>% na.omit()
vars_list.hkg = list_after_2019.hkg %>% select(all_of(c(list_of_vars, "Price"))) %>% na.omit()
vars_list.nrt = list_after_2019.nrt %>% select(all_of(c(list_of_vars, "Price"))) %>% na.omit()

# vars_list.sin
paint_correlations <- function(listing)
{
  # chart.Correlation(listing, histogram=TRUE, pch=19)
  corrplot::corrplot(cor(listing, use = "complete.obs"), method="square", type="lower")
}
paint_correlations(vars_list.sin)

paint_correlations(vars_list.nrt)

paint_correlations(vars_list.tpe)

paint_correlations(vars_list.hkg)

3. Modelling

3.1 Principal Components Regression - Predicting Price

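Principal Components Regression (PCR) finds M linear combinations (“principal components”) of our predictors and then uses least squares to fit a linear regression of the response on those components. Because the components are ordered by how much of the predictors’ variance they capture, a small M can summarise many correlated predictors.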
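To make the two steps concrete, here is a minimal base R sketch of PCR done by hand with `prcomp()` and `lm()`. The data are synthetic and the helper `pcr_by_hand()` is a hypothetical name used only for illustration; the models in this document use `pcr()` from the pls package instead.

```r
set.seed(1)

# Synthetic data: 200 rows, 4 correlated predictors (all hypothetical).
n <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.3)
x3 <- rnorm(n)
x4 <- x3 + rnorm(n, sd = 0.3)
X <- cbind(x1, x2, x3, x4)
y <- 2 * x1 - x3 + rnorm(n)

# Step 1: principal components of the centred, scaled predictors.
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Step 2: least squares of the response on the first M component scores.
pcr_by_hand <- function(M) {
  scores <- pca$x[, seq_len(M), drop = FALSE]
  lm(y ~ scores)
}

fit_m2   <- pcr_by_hand(2)  # reduced model with M = 2
fit_full <- pcr_by_hand(4)  # M equal to the number of predictors
fit_ols  <- lm(y ~ X)       # ordinary least squares for comparison

# With all components kept, PCR reproduces the OLS fitted values.
print(all.equal(unname(fitted(fit_full)), unname(fitted(fit_ols))))
print(summary(fit_m2)$r.squared)
```

Keeping every component gives back ordinary least squares, so any benefit of PCR comes from choosing M smaller than the number of predictors, which is what the cross-validation output is meant to guide.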

set.seed(1)

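pcr_model <- function (listings, city)
{
  # Fit PCR on standardised predictors with cross-validation.
  # Each predictor appears once; a repeated term in an R formula is dropped anyway.
  fit <- pcr(Price ~ reviews_since_2019 + Rating + host_acceptance_rate + host_Superhost + Shared_ind + House_ind + Private_ind + Amenities_Wifi + hv_email, data=listings, scale=TRUE, validation="CV")

  summary(fit)
  plot(fit)
  validationplot(fit, val.type="MSEP")
  validationplot(fit, val.type="R2")
  # NOTE: predict(fit) returns predictions for every number of components, and only for
  # the rows used in the fit, so its length differs from listings$Price. mae() then
  # recycles values (hence the length warning in the output), so this MAE is unreliable.
  print(paste("MAE for", city,":", mae(listings$Price, predict(fit))))
  return (fit)
}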
pcr_model.sin <-pcr_model(list_after_2019.sin, "Singapore")
## Data:    X dimension: 1405 9 
##  Y dimension: 1405 1
## Fit method: svdpc
## Number of components considered: 9
## 
## VALIDATION: RMSEP
## Cross-validated using 10 random segments.
##        (Intercept)  1 comps  2 comps  3 comps  4 comps  5 comps  6 comps
## CV           152.6    142.9    142.1    141.6    142.4    142.2    142.3
## adjCV        152.6    142.9    142.0    141.6    142.3    142.1    142.2
##        7 comps  8 comps  9 comps
## CV       140.8    140.7    140.6
## adjCV    140.7    140.6    140.5
## 
## TRAINING: % variance explained
##        1 comps  2 comps  3 comps  4 comps  5 comps  6 comps  7 comps  8 comps
## X        22.14    40.73    53.25    64.19    74.57    84.27    92.71   100.00
## Price    12.89    13.94    14.45    14.45    15.23    15.23    16.90    17.21
##        9 comps
## X       100.00
## Price    17.21

## Warning in actual - predicted: longer object length is not a multiple of shorter
## object length

## [1] "MAE for Singapore : 108.772298292697"
pcr_model.hkg <-pcr_model(list_after_2019.hkg, "Hong Kong")
## Data:    X dimension: 1945 9 
##  Y dimension: 1945 1
## Fit method: svdpc
## Number of components considered: 9
## 
## VALIDATION: RMSEP
## Cross-validated using 10 random segments.
##        (Intercept)  1 comps  2 comps  3 comps  4 comps  5 comps  6 comps
## CV            2455     2455     2455     2454     2455     2456     2457
## adjCV         2455     2455     2455     2454     2455     2456     2456
##        7 comps  8 comps  9 comps
## CV        2458     2455     2455
## adjCV     2457     2454     2455
## 
## TRAINING: % variance explained
##        1 comps  2 comps  3 comps  4 comps  5 comps  6 comps  7 comps   8 comps
## X      21.7730  37.2315  50.3304  62.1254  72.9752  83.0920  91.9797  100.0000
## Price   0.2567   0.2572   0.4119   0.4395   0.4482   0.5174   0.5345    0.8255
##         9 comps
## X      100.0000
## Price    0.8256

## Warning in actual - predicted: longer object length is not a multiple of shorter
## object length

## [1] "MAE for Hong Kong : 869.547093791851"
pcr_model.tpe <-pcr_model(list_after_2019.tpe, "Taipei")
## Data:    X dimension: 2175 9 
##  Y dimension: 2175 1
## Fit method: svdpc
## Number of components considered: 9
## 
## VALIDATION: RMSEP
## Cross-validated using 10 random segments.
##        (Intercept)  1 comps  2 comps  3 comps  4 comps  5 comps  6 comps
## CV            4229     4170     4170     4153     4154     4156     4148
## adjCV         4229     4170     4169     4153     4153     4156     4148
##        7 comps  8 comps  9 comps
## CV        4134     4131     4132
## adjCV     4133     4130     4131
## 
## TRAINING: % variance explained
##        1 comps  2 comps  3 comps  4 comps  5 comps  6 comps  7 comps  8 comps
## X       24.408   39.493   52.740   64.281   74.915   84.836   93.121  100.000
## Price    2.953    3.029    3.792    3.795    3.795    4.154    4.932    5.071
##        9 comps
## X      100.000
## Price    5.078

## Warning in actual - predicted: longer object length is not a multiple of shorter
## object length

## [1] "MAE for Taipei : 2209.48295156008"
pcr_model.nrt <-pcr_model(list_after_2019.nrt, "Tokyo")
## Data:    X dimension: 7249 9 
##  Y dimension: 7249 1
## Fit method: svdpc
## Number of components considered: 9
## 
## VALIDATION: RMSEP
## Cross-validated using 10 random segments.
##        (Intercept)  1 comps  2 comps  3 comps  4 comps  5 comps  6 comps
## CV           29610    29516    29490    29496    29487    29489    29492
## adjCV        29610    29516    29489    29495    29486    29488    29491
##        7 comps  8 comps  9 comps
## CV       29492    29494    29495
## adjCV    29490    29492    29493
## 
## TRAINING: % variance explained
##        1 comps  2 comps  3 comps  4 comps  5 comps  6 comps  7 comps   8 comps
## X      22.2622  37.4091  49.4255  61.1240  71.8952  82.0212  91.8955  100.0000
## Price   0.6487   0.8555   0.8623   0.9325   0.9325   0.9327   0.9626    0.9718
##         9 comps
## X      100.0000
## Price    0.9733

## Warning in actual - predicted: longer object length is not a multiple of shorter
## object length

## [1] "MAE for Tokyo : 12041.061870605"

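The reported mean absolute error differs widely by city: about 109 for Singapore, 870 for Hong Kong, 2,209 for Taipei and 12,041 for Tokyo, presumably each in the currency of that city's listings. These figures should be treated with caution, because the length warning shows that the actual and predicted values passed to mae() did not have the same length. The components also explain little of the variance in price: 17.2% in Singapore and under 6% in the other three cities.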
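The length warning in the output above suggests the MAE was computed on mismatched vectors. The sketch below shows, with hypothetical numbers and a hypothetical helper `mae_checked()`, how the error could be computed only on the rows used in the fit and separately for each number of components. With the pls package, we believe `predict(fit, ncomp = k)` restricts the predictions to a single model size, but that call is not exercised here.

```r
# Hypothetical prices for five listings; the third is assumed to have a
# missing predictor, so the fitted model would have skipped it.
actual <- c(100, 200, 300, 400, 250)
used_in_fit <- c(TRUE, TRUE, FALSE, TRUE, TRUE)

# Hypothetical predictions: one column per number of components,
# one row per listing that was used in the fit.
predicted <- cbind(ncomp_1 = c(110, 190, 380, 260),
                   ncomp_2 = c(105, 205, 395, 255))

# MAE that refuses to recycle inputs of different lengths.
mae_checked <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted))
  mean(abs(actual - predicted))
}

actual_used <- actual[used_in_fit]
mae_by_ncomp <- apply(predicted, 2, function(p) mae_checked(actual_used, p))
print(mae_by_ncomp)
```

Failing loudly on a length mismatch is a deliberate choice here: R's recycling would otherwise return a number that looks plausible but compares the wrong pairs of values.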

3.2 Predicting Earnings

Each listing's price is a single fixed value in the dataset and is, in fact, recommended by Airbnb itself. It therefore makes more sense to model earnings rather than price, since earnings also depend on the number of bookings at a given price.
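As a quick arithmetic check of how earnings relate to price and bookings, the sketch below re-uses the four boat rows printed in section 2.2. In those rows, earnings equal price times bookings times a constant of 3. Our guess is that the constant is an assumed number of nights per booking, but that is an assumption: the definition of `earnings_since_2019` sits earlier in the document.

```r
# Values copied from the four boat rows printed in section 2.2.
boats <- data.frame(
  id                  = c(31527262, 37907711, 20247516, 50433019),
  Price               = c(344, 199, 2500, 288),
  bookings_since_2019 = c(434, 354, 52, 14),
  earnings_since_2019 = c(447888, 211338, 390000, 12096)
)

# Implied multiplier: earnings / (price * bookings).
boats$multiplier <- with(boats, earnings_since_2019 / (Price * bookings_since_2019))
print(boats)
```

If this relationship holds across the whole dataset, earnings are close to a deterministic function of `Price` and the booking count, which is worth keeping in mind when reading the variance explained by models that include both as predictors.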

set.seed(1)

pcr_model_earnings <- function (listings, city)
{
  # Same predictors as the price model, plus Price itself; each predictor appears once.
  fit <- pcr(earnings_since_2019 ~ Price + reviews_since_2019 + Rating + host_acceptance_rate + host_Superhost + Shared_ind + House_ind + Private_ind + Amenities_Wifi + hv_email, data=listings, scale=TRUE, validation="CV")
  # Other candidate predictors considered:
  # "host_Superhost", "no_of_am","Amenities_Wifi","Amenities_Shampoo","Amenities_Kitchen","Amenities_Long_Term","Amenities_Washer",
  # "Amenities_HairDryer", "Amenities_HotWater", "Amenities_TV", "Amenities_AC", "hv_email",  "hv_facebook", "hv_reviews",
  # "hv_manual_offline", "hv_manual_jumio", "hv_manual_off_gov", "hv_manual_gov", "hv_manual_work_email", "no_of_vf", "Days_since_last_review",
  # "reviews_since_2019","bookings_since_2019"

  summary(fit)
  plot(fit)
  validationplot(fit, val.type="MSEP")
  validationplot(fit, val.type="R2")
  # NOTE: predict(fit) covers every number of components and only the rows used in the fit,
  # so mae() recycles values (hence the length warning in the output); this MAE is unreliable.
  print(paste("MAE for", city,":", mae(listings$earnings_since_2019, predict(fit))))
  return (fit)
}
pcr_model.sin <-pcr_model_earnings(list_after_2019.sin, "Singapore")
## Data:    X dimension: 1405 10 
##  Y dimension: 1405 1
## Fit method: svdpc
## Number of components considered: 10
## 
## VALIDATION: RMSEP
## Cross-validated using 10 random segments.
##        (Intercept)  1 comps  2 comps  3 comps  4 comps  5 comps  6 comps
## CV           31428    29557    28372    28288    25678    25378    22925
## adjCV        31428    29554    28366    28286    24823    25441    23114
##        7 comps  8 comps  9 comps  10 comps
## CV       17510    17465    16439     16436
## adjCV    17453    17414    16395     16391
## 
## TRAINING: % variance explained
##                      1 comps  2 comps  3 comps  4 comps  5 comps  6 comps
## X                      22.14    38.99     50.4    60.24    69.93    78.66
## earnings_since_2019    11.70    18.99     19.6    37.12    41.06    52.10
##                      7 comps  8 comps  9 comps  10 comps
## X                      87.03    93.82   100.00    100.00
## earnings_since_2019    71.88    72.12    75.15     75.15

## Warning in actual - predicted: longer object length is not a multiple of shorter
## object length

## [1] "MAE for Singapore : 20085.8644280392"
pcr_model.hkg <-pcr_model_earnings(list_after_2019.hkg, "Hong Kong")
## Data:    X dimension: 1945 10 
##  Y dimension: 1945 1
## Fit method: svdpc
## Number of components considered: 10
## 
## VALIDATION: RMSEP
## Cross-validated using 10 random segments.
##        (Intercept)  1 comps  2 comps  3 comps  4 comps  5 comps  6 comps
## CV          410901   402205   390090   377536   381124   376353   341603
## adjCV       410901   402288   390081   377417   381617   367444   338156
##        7 comps  8 comps  9 comps  10 comps
## CV      346454   318750   295779    295967
## adjCV   343001   315592   292494    292664
## 
## TRAINING: % variance explained
##                      1 comps  2 comps  3 comps  4 comps  5 comps  6 comps
## X                     19.648    33.56    45.45    56.10    66.09    75.83
## earnings_since_2019    3.538    10.11    15.39    15.39    48.30    49.76
##                      7 comps  8 comps  9 comps  10 comps
## X                      84.87    92.86   100.00    100.00
## earnings_since_2019    49.91    58.56    65.66     65.66

## Warning in actual - predicted: longer object length is not a multiple of shorter
## object length

## [1] "MAE for Hong Kong : 200007.767176238"
pcr_model.tpe <-pcr_model_earnings(list_after_2019.tpe, "Taipei")
## Data:    X dimension: 2175 10 
##  Y dimension: 2175 1
## Fit method: svdpc
## Number of components considered: 10
## 
## VALIDATION: RMSEP
## Cross-validated using 10 random segments.
##        (Intercept)  1 comps  2 comps  3 comps  4 comps  5 comps  6 comps
## CV         1181326  1104149  1103314  1071703  1072319   990895   976066
## adjCV      1181326  1104158  1103341  1071364  1073671   959962   972962
##        7 comps  8 comps  9 comps  10 comps
## CV      967203   780310   753804    752060
## adjCV   964362   776832   750544    748890
## 
## TRAINING: % variance explained
##                      1 comps  2 comps  3 comps  4 comps  5 comps  6 comps
## X                      22.49    36.10    48.35    58.74    68.46    78.03
## earnings_since_2019    12.47    12.77    18.39    18.73    38.85    38.90
##                      7 comps  8 comps  9 comps  10 comps
## X                      86.66    93.84   100.00    100.00
## earnings_since_2019    42.44    62.81    65.52     65.52

## Warning in actual - predicted: longer object length is not a multiple of shorter
## object length

## [1] "MAE for Taipei : 621874.28842328"
pcr_model.nrt <-pcr_model_earnings(list_after_2019.nrt, "Tokyo")
## Data:    X dimension: 7249 10 
##  Y dimension: 7249 1
## Fit method: svdpc
## Number of components considered: 10
## 
## VALIDATION: RMSEP
## Cross-validated using 10 random segments.
##        (Intercept)  1 comps  2 comps  3 comps  4 comps  5 comps  6 comps
## CV         7764047  7613827  7340841  7216238  7147811  5899329  5450582
## adjCV      7764047  7613976  7341342  7222899  7149043  5652462  5440735
##        7 comps  8 comps  9 comps  10 comps
## CV     4892522  4785782  4762564   4762764
## adjCV  4863554  4777938  4754861   4755034
## 
## TRAINING: % variance explained
##                      1 comps  2 comps  3 comps  4 comps  5 comps  6 comps
## X                     20.165    33.87    44.69    55.31    65.04    74.74
## earnings_since_2019    3.815    10.69    14.11    15.84    50.72    53.89
##                      7 comps  8 comps  9 comps  10 comps
## X                      83.85    92.71   100.00    100.00
## earnings_since_2019    63.09    64.32    64.64     64.64

## Warning in actual - predicted: longer object length is not a multiple of shorter
## object length

## [1] "MAE for Tokyo : 3905051.77520906"

4. Step-wise Regression

4.1 Cleaning Data

We have two ways of cleaning:

* **clean_subset_including**: Keeps only the variables we want. (I used this for stepwise regression, since the built-in stepwise regression function creates dummy variables automatically.)
* **clean_subset_excluding**: Drops the variables we don't want. (I used this for Lasso, since there are many dummy variables and I would rather exclude only those not needed.)
### For checking number of missing data
# md.pattern(list_after_2019.country)

clean_subset_including <- function(list_after_2019.country) {
  ### Arbitrary selection of a list of variables
  keep_columns <- c(
    "Reviews", "Beds", "Capacity", "Monthly_Reviews", "Property_Type", "Room_Type", "Rating",
    "neighbourhood_cleansed", "host_response_time", "host_acceptance_rate", "host_Superhost",
    "no_of_am", "Amenities_Wifi", "Amenities_Shampoo", "Amenities_Kitchen", "Amenities_Long_Term",
    "Amenities_Washer", "Amenities_HairDryer", "Amenities_HotWater", "Amenities_TV", "Amenities_AC",
    "hv_email", "hv_phone", "hv_facebook", "hv_reviews", "hv_manual_offline", "hv_manual_jumio",
    "hv_manual_off_gov", "hv_manual_gov", "hv_manual_work_email", "no_of_vf",
    "Days_since_last_review", "Capacity_Sqr", "Beds_Sqr", "ln_Beds", "ln_Capacity", "ln_Rating",
    "Shared_ind", "House_ind", "Private_ind", "Capacity_x_Shared_ind", "H_Cap", "P_Cap",
    "ln_Capacity_x_Shared_ind", "ln_Capacity_x_House_ind", "ln_Capacity_x_Private_ind",
    "reviews_since_2019", "bookings_since_2019", "earnings_since_2019",
    "nb_group_1", "nb_group_2", "nb_group_3", "nb_group_4", "nb_group_5"
  )
  selecting_columns <- list_after_2019.country[, keep_columns]

  ### Removing rows with missing values instead of imputing
  selecting_columns <- na.omit(selecting_columns)
  selecting_columns$Property_Type <- as.factor(selecting_columns$Property_Type)
  selecting_columns$neighbourhood_cleansed <- as.factor(selecting_columns$neighbourhood_cleansed)
  selecting_columns$host_response_time <- as.factor(selecting_columns$host_response_time)

  return(selecting_columns)
}

###############################################################################################################

clean_subset_excluding <- function(list_after_2019.country) {
  ### Columns to drop; everything else is kept
  drop_columns <- c(
    "id", "Price", "ln_Price", "host_response_time", "host_response_rate",
    "host_verifications", "Baths", "Baths_Sqr", "ln_Baths", "latitude", "longitude",
    "neighbourhood_cleansed", "amenities", "last_review", "1", "0",
    "host_response_hours", "host_response_day", "host_response_few_days"
  )
  selecting_columns <- list_after_2019.country[, !names(list_after_2019.country) %in% drop_columns]

  selecting_columns <- na.omit(selecting_columns)
  return(selecting_columns)
}

###############################################################################################################

# For Stepwise Regression Input
list_after_2019.sin_step <- clean_subset_including(list_after_2019.sin)
list_after_2019.hkg_step <- clean_subset_including(list_after_2019.hkg)
list_after_2019.nrt_step <- clean_subset_including(list_after_2019.nrt)
list_after_2019.tpe_step <- clean_subset_including(list_after_2019.tpe)

# For Lasso Regression Input
list_after_2019.sin_clean <- clean_subset_excluding(list_after_2019.sin_remove)
list_after_2019.hkg_clean <- clean_subset_excluding(list_after_2019.hkg_remove)
list_after_2019.nrt_clean <- clean_subset_excluding(list_after_2019.nrt_remove)
list_after_2019.tpe_clean <- clean_subset_excluding(list_after_2019.tpe_remove)
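
The two cleaning routes above differ only in whether the columns to keep or the columns to drop are named. A minimal sketch on a toy data frame (the data and the names `toy`, `keep_cols` and `drop_cols` are made up for illustration):

```r
# Toy data frame, for illustration only (values and column choice are made up)
toy <- data.frame(
  id       = 1:4,
  Beds     = c(1, 2, NA, 3),
  Capacity = c(2, 4, 3, 6),
  Price    = c(80, 120, 95, 200)
)

# "Including": name the columns to keep, then drop incomplete rows
keep_cols <- c("Beds", "Capacity")
included  <- na.omit(toy[, keep_cols])

# "Excluding": name the columns to drop and keep everything else
drop_cols <- c("id", "Price")
excluded  <- na.omit(toy[, !names(toy) %in% drop_cols])

# Both routes end up with the same columns and the same complete rows here
same_columns <- identical(names(included), names(excluded))
```

The including route is safer when new columns may appear upstream, while the excluding route is shorter when most columns (for example, many dummy variables) should stay.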

4.2 Feature Selection - Backward Stepwise Regression

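The multiple R-squared is about 0.78 for the Singapore model (`sin`). The other three models fit less well, with values of 0.42 (`hkg`), 0.29 (`nrt`) and 0.21 (`tpe`) in the summaries below.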

stepwise_regression_model <- function(list_after_2019.country_step) {

  # Define the smallest (intercept-only) and full models
  minmod <- lm(earnings_since_2019 ~ 1, data = list_after_2019.country_step)
  fullmod <- lm(earnings_since_2019 ~ ., data = list_after_2019.country_step)

  # Penalty per parameter: k = log(nobs(fullmod)) gives BIC, k = 2 gives AIC
  backward_regression_model <- step(fullmod, scope = list(lower = minmod, upper = fullmod),
                                    direction = "backward", k = log(nobs(fullmod)), trace = FALSE)
  return(backward_regression_model)
}


summary(stepwise_regression_model(list_after_2019.sin_step))
## 
## Call:
## lm(formula = earnings_since_2019 ~ Reviews + Monthly_Reviews + 
##     Property_Type + host_Superhost + Amenities_Wifi + hv_manual_gov + 
##     no_of_vf + H_Cap + ln_Capacity_x_House_ind + reviews_since_2019, 
##     data = list_after_2019.country_step)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -155312   -4254     279    3395  189765 
## 
## Coefficients:
##                                                    Estimate Std. Error t value
## (Intercept)                                       187433.62   10354.21  18.102
## Reviews                                              -85.69      18.28  -4.689
## Monthly_Reviews                                    -4210.84     937.01  -4.494
## Property_TypeCampsite                            -175695.52   12187.31 -14.416
## Property_TypeEntire condominium (condo)          -155254.55    7979.33 -19.457
## Property_TypeEntire guest suite                  -172580.58   13310.45 -12.966
## Property_TypeEntire guesthouse                   -156938.24   13399.55 -11.712
## Property_TypeEntire loft                         -157620.54   11799.96 -13.358
## Property_TypeEntire place                        -152593.14   11862.39 -12.864
## Property_TypeEntire rental unit                  -158833.97    7994.67 -19.867
## Property_TypeEntire residential home             -161667.85    8863.15 -18.240
## Property_TypeEntire serviced apartment           -154620.77    8055.14 -19.195
## Property_TypeEntire townhouse                    -162330.65   13318.70 -12.188
## Property_TypePrivate room                        -171680.40   11870.51 -14.463
## Property_TypePrivate room in bed and breakfast   -167769.83   11328.35 -14.810
## Property_TypePrivate room in bungalow            -166950.33   10560.68 -15.809
## Property_TypePrivate room in condominium (condo) -167160.08    9868.85 -16.938
## Property_TypePrivate room in guest suite         -166760.52   18033.46  -9.247
## Property_TypePrivate room in hostel              -163602.33   10778.25 -15.179
## Property_TypePrivate room in loft                -171014.90   11790.56 -14.504
## Property_TypePrivate room in rental unit         -169044.91    9787.60 -17.271
## Property_TypePrivate room in residential home    -173183.61    9823.84 -17.629
## Property_TypePrivate room in serviced apartment  -164442.30   10390.40 -15.826
## Property_TypePrivate room in townhouse           -170533.80   10070.03 -16.935
## Property_TypePrivate room in villa               -169661.60   11300.51 -15.014
## Property_TypeRoom in aparthotel                  -167088.38   18016.06  -9.274
## Property_TypeRoom in boutique hotel              -164189.59    9759.79 -16.823
## Property_TypeRoom in hotel                       -162818.84    9986.37 -16.304
## Property_TypeShared room                         -169009.50   11598.85 -14.571
## Property_TypeShared room in bed and breakfast    -168024.94   10779.65 -15.587
## Property_TypeShared room in boutique hotel       -167198.96   13110.99 -12.753
## Property_TypeShared room in hostel               -169297.87   10273.78 -16.479
## Property_TypeShared room in rental unit          -179732.59   11831.39 -15.191
## Property_TypeShared room in residential home     -165368.85   14510.77 -11.396
## Property_TypeTent                                -178738.36   17365.81 -10.293
## Property_TypeTiny house                          -119439.31   13303.70  -8.978
## host_Superhost                                      3908.46     919.67   4.250
## Amenities_Wifi                                    -15441.34    4298.87  -3.592
## hv_manual_gov                                       7210.26    1464.49   4.923
## no_of_vf                                           -2361.64     361.54  -6.532
## H_Cap                                               8883.74    1258.71   7.058
## ln_Capacity_x_House_ind                           -24416.98    6518.16  -3.746
## reviews_since_2019                                  1176.44      38.33  30.696
##                                                  Pr(>|t|)    
## (Intercept)                                       < 2e-16 ***
## Reviews                                          3.03e-06 ***
## Monthly_Reviews                                  7.60e-06 ***
## Property_TypeCampsite                             < 2e-16 ***
## Property_TypeEntire condominium (condo)           < 2e-16 ***
## Property_TypeEntire guest suite                   < 2e-16 ***
## Property_TypeEntire guesthouse                    < 2e-16 ***
## Property_TypeEntire loft                          < 2e-16 ***
## Property_TypeEntire place                         < 2e-16 ***
## Property_TypeEntire rental unit                   < 2e-16 ***
## Property_TypeEntire residential home              < 2e-16 ***
## Property_TypeEntire serviced apartment            < 2e-16 ***
## Property_TypeEntire townhouse                     < 2e-16 ***
## Property_TypePrivate room                         < 2e-16 ***
## Property_TypePrivate room in bed and breakfast    < 2e-16 ***
## Property_TypePrivate room in bungalow             < 2e-16 ***
## Property_TypePrivate room in condominium (condo)  < 2e-16 ***
## Property_TypePrivate room in guest suite          < 2e-16 ***
## Property_TypePrivate room in hostel               < 2e-16 ***
## Property_TypePrivate room in loft                 < 2e-16 ***
## Property_TypePrivate room in rental unit          < 2e-16 ***
## Property_TypePrivate room in residential home     < 2e-16 ***
## Property_TypePrivate room in serviced apartment   < 2e-16 ***
## Property_TypePrivate room in townhouse            < 2e-16 ***
## Property_TypePrivate room in villa                < 2e-16 ***
## Property_TypeRoom in aparthotel                   < 2e-16 ***
## Property_TypeRoom in boutique hotel               < 2e-16 ***
## Property_TypeRoom in hotel                        < 2e-16 ***
## Property_TypeShared room                          < 2e-16 ***
## Property_TypeShared room in bed and breakfast     < 2e-16 ***
## Property_TypeShared room in boutique hotel        < 2e-16 ***
## Property_TypeShared room in hostel                < 2e-16 ***
## Property_TypeShared room in rental unit           < 2e-16 ***
## Property_TypeShared room in residential home      < 2e-16 ***
## Property_TypeTent                                 < 2e-16 ***
## Property_TypeTiny house                           < 2e-16 ***
## host_Superhost                                   2.29e-05 ***
## Amenities_Wifi                                   0.000340 ***
## hv_manual_gov                                    9.57e-07 ***
## no_of_vf                                         9.20e-11 ***
## H_Cap                                            2.71e-12 ***
## ln_Capacity_x_House_ind                          0.000187 ***
## reviews_since_2019                                < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 15170 on 1330 degrees of freedom
## Multiple R-squared:  0.7785, Adjusted R-squared:  0.7715 
## F-statistic: 111.3 on 42 and 1330 DF,  p-value: < 2.2e-16
summary(stepwise_regression_model(list_after_2019.hkg_step))
## 
## Call:
## lm(formula = earnings_since_2019 ~ Reviews + neighbourhood_cleansed + 
##     hv_manual_jumio + hv_manual_gov + Days_since_last_review + 
##     H_Cap + reviews_since_2019, data = list_after_2019.country_step)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -4231374   -56096    -8498    29868  8234314 
## 
## Coefficients:
##                                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                          -50166.34   26759.51  -1.875 0.060986 .  
## Reviews                                -691.75     183.45  -3.771 0.000168 ***
## neighbourhood_cleansedEastern         -1312.14   46076.05  -0.028 0.977284    
## neighbourhood_cleansedIslands         -6024.28   31442.03  -0.192 0.848076    
## neighbourhood_cleansedKowloon City    -6505.92   42112.57  -0.154 0.877241    
## neighbourhood_cleansedKwai Tsing      26204.03  225261.95   0.116 0.907406    
## neighbourhood_cleansedKwun Tong      -31013.45  184191.07  -0.168 0.866305    
## neighbourhood_cleansedNorth          -29173.70   46986.46  -0.621 0.534743    
## neighbourhood_cleansedSai Kung       122437.57   55735.96   2.197 0.028159 *  
## neighbourhood_cleansedSha Tin         10976.65   75705.09   0.145 0.884732    
## neighbourhood_cleansedSham Shui Po   -30313.31   69380.60  -0.437 0.662224    
## neighbourhood_cleansedSouthern        35842.82   68099.71   0.526 0.598721    
## neighbourhood_cleansedTai Po          67172.51   80117.87   0.838 0.401900    
## neighbourhood_cleansedTsuen Wan     4233457.53  184206.47  22.982  < 2e-16 ***
## neighbourhood_cleansedTuen Mun        47598.86  102477.90   0.464 0.642358    
## neighbourhood_cleansedWan Chai       -34554.08   29100.74  -1.187 0.235220    
## neighbourhood_cleansedWong Tai Sin    -2356.44  317666.64  -0.007 0.994082    
## neighbourhood_cleansedYau Tsim Mong  -22565.15   24172.81  -0.933 0.350684    
## neighbourhood_cleansedYuen Long      -42325.19   46028.10  -0.920 0.357924    
## hv_manual_jumio                      160848.12   37594.72   4.278 1.98e-05 ***
## hv_manual_gov                       -184078.09   37774.85  -4.873 1.19e-06 ***
## Days_since_last_review                   86.50      23.13   3.739 0.000190 ***
## H_Cap                                 18941.67    2761.51   6.859 9.34e-12 ***
## reviews_since_2019                     7585.89     396.22  19.146  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 316900 on 1898 degrees of freedom
## Multiple R-squared:  0.4187, Adjusted R-squared:  0.4116 
## F-statistic: 59.43 on 23 and 1898 DF,  p-value: < 2.2e-16
summary(stepwise_regression_model(list_after_2019.nrt_step))
## 
## Call:
## lm(formula = earnings_since_2019 ~ Reviews + Capacity + Monthly_Reviews + 
##     hv_reviews + Days_since_last_review + Capacity_Sqr + ln_Capacity + 
##     reviews_since_2019, data = list_after_2019.country_step)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -31027530  -1465859    -33262    900095 240476057 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)    
## (Intercept)             1429357.3   965276.6   1.481  0.13871    
## Reviews                  -12944.8     2960.5  -4.373 1.25e-05 ***
## Capacity                2210651.3   461045.1   4.795 1.66e-06 ***
## Monthly_Reviews          353356.1    98155.3   3.600  0.00032 ***
## hv_reviews              -750024.7   177135.9  -4.234 2.32e-05 ***
## Days_since_last_review     1660.3      314.3   5.283 1.31e-07 ***
## Capacity_Sqr             -48167.0    16128.4  -2.986  0.00283 ** 
## ln_Capacity            -6679136.8  1592238.3  -4.195 2.76e-05 ***
## reviews_since_2019       130606.4     5078.2  25.719  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6606000 on 7063 degrees of freedom
## Multiple R-squared:  0.2914, Adjusted R-squared:  0.2906 
## F-statistic: 363.1 on 8 and 7063 DF,  p-value: < 2.2e-16
summary(stepwise_regression_model(list_after_2019.tpe_step))
## 
## Call:
## lm(formula = earnings_since_2019 ~ hv_phone + H_Cap + ln_Capacity_x_House_ind + 
##     reviews_since_2019, data = list_after_2019.country_step)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -3341907  -146373    11476    81599 33174950 
## 
## Coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)              2925762.9   433853.5   6.744 1.98e-11 ***
## hv_phone                -2966452.6   432318.3  -6.862 8.88e-12 ***
## H_Cap                     191366.5    17877.8  10.704  < 2e-16 ***
## ln_Capacity_x_House_ind  -445223.8    68206.3  -6.528 8.32e-11 ***
## reviews_since_2019         14139.8      839.1  16.851  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1056000 on 2140 degrees of freedom
## Multiple R-squared:  0.2118, Adjusted R-squared:  0.2103 
## F-statistic: 143.7 on 4 and 2140 DF,  p-value: < 2.2e-16
# Forward stepwise regression is kept in comments in case we need it
# forward_regression = step(minmod, scope = list(lower = minmod, upper = fullmod),direction = "forward", k=log(nobs(fullmod)), trace=F)
# summary(forward_regression)
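
The `k` argument of `step()` sets the penalty per parameter, which is how the function above switches between AIC (`k = 2`) and BIC (`k = log(n)`). A minimal sketch of the difference, run on the built-in `mtcars` data rather than the Airbnb listings (the object names `full_fit`, `aic_fit` and `bic_fit` are illustrative only):

```r
# Illustration on the built-in mtcars data, not the Airbnb listings
full_fit <- lm(mpg ~ ., data = mtcars)
n <- nobs(full_fit)

# Backward elimination under AIC (k = 2) and under BIC (k = log(n))
aic_fit <- step(full_fit, direction = "backward", k = 2, trace = 0)
bic_fit <- step(full_fit, direction = "backward", k = log(n), trace = 0)

# Number of predictors retained under each penalty
n_aic <- length(coef(aic_fit)) - 1
n_bic <- length(coef(bic_fit)) - 1
```

Because `log(n)` exceeds 2 once there are eight or more observations, BIC penalises each extra term more heavily and tends to retain a model no larger than the AIC choice.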

4.3 Backward Stepwise Regression using Leaps Package

# regsubsets() is provided by the leaps package; uncomment if it is not already loaded
# library(leaps)
backwardstep_leaps_sin <- regsubsets(earnings_since_2019~., data = list_after_2019.sin_step, nvmax = 5,method = "backward")
## Warning in leaps.setup(x, y, wt = wt, nbest = nbest, nvmax = nvmax, force.in =
## force.in, : 14 linear dependencies found
## Reordering variables and trying again:
backwardstep_leaps_hkg <- regsubsets(earnings_since_2019~., data = list_after_2019.hkg_step, nvmax = 5,method = "backward")
## Warning in leaps.setup(x, y, wt = wt, nbest = nbest, nvmax = nvmax, force.in =
## force.in, : 12 linear dependencies found
## Reordering variables and trying again:
backwardstep_leaps_nrt <- regsubsets(earnings_since_2019~., data = list_after_2019.nrt_step, nvmax = 5,method = "backward")
## Warning in leaps.setup(x, y, wt = wt, nbest = nbest, nvmax = nvmax, force.in =
## force.in, : 12 linear dependencies found
## Reordering variables and trying again:
backwardstep_leaps_tpe <- regsubsets(earnings_since_2019~., data = list_after_2019.tpe_step, nvmax = 5,method = "backward")
## Warning in leaps.setup(x, y, wt = wt, nbest = nbest, nvmax = nvmax, force.in =
## force.in, : 12 linear dependencies found
## Reordering variables and trying again:
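
The `$which` matrix printed below has one row per model size and one logical column per candidate term, which makes it very wide once factor levels are expanded into dummies. A minimal sketch of listing only the selected terms for each size, using a toy matrix with made-up term names as a stand-in for `summary(backwardstep_leaps_sin)$which`:

```r
# Toy stand-in for summary(regsubsets_fit)$which (term names are made up):
# one row per model size, one logical column per candidate term.
which_mat <- rbind(
  "1" = c("(Intercept)" = TRUE, reviews = TRUE, capacity = FALSE, wifi = FALSE),
  "2" = c("(Intercept)" = TRUE, reviews = TRUE, capacity = TRUE,  wifi = FALSE),
  "3" = c("(Intercept)" = TRUE, reviews = TRUE, capacity = TRUE,  wifi = TRUE)
)

# For each model size, keep only the names of the terms flagged TRUE
selected_terms <- lapply(seq_len(nrow(which_mat)), function(i) {
  row <- which_mat[i, ]
  setdiff(names(row)[row], "(Intercept)")
})
names(selected_terms) <- rownames(which_mat)
```

Applied to the real object, the same loop would return a short list of term names per model size instead of the full TRUE/FALSE grid.
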
summary(backwardstep_leaps_sin)$which
##   (Intercept) Reviews  Beds Capacity Monthly_Reviews Property_TypeCampsite
## 1        TRUE   FALSE FALSE    FALSE           FALSE                 FALSE
## 2        TRUE   FALSE FALSE    FALSE           FALSE                 FALSE
## 3        TRUE   FALSE FALSE    FALSE           FALSE                 FALSE
## 4        TRUE   FALSE FALSE    FALSE           FALSE                 FALSE
## 5        TRUE   FALSE FALSE    FALSE           FALSE                 FALSE
## 6        TRUE   FALSE FALSE    FALSE           FALSE                 FALSE
##   Property_TypeEntire condominium (condo) Property_TypeEntire guest suite
## 1                                   FALSE                           FALSE
## 2                                   FALSE                           FALSE
## 3                                   FALSE                           FALSE
## 4                                   FALSE                           FALSE
## 5                                    TRUE                           FALSE
## 6                                    TRUE                           FALSE
##   Property_TypeEntire guesthouse Property_TypeEntire loft
## 1                          FALSE                    FALSE
## 2                          FALSE                    FALSE
## 3                          FALSE                    FALSE
## 4                          FALSE                    FALSE
## 5                          FALSE                    FALSE
## 6                          FALSE                    FALSE
##   Property_TypeEntire place Property_TypeEntire rental unit
## 1                     FALSE                           FALSE
## 2                     FALSE                           FALSE
## 3                     FALSE                            TRUE
## 4                     FALSE                            TRUE
## 5                     FALSE                            TRUE
## 6                     FALSE                            TRUE
##   Property_TypeEntire residential home Property_TypeEntire serviced apartment
## 1                                FALSE                                  FALSE
## 2                                FALSE                                  FALSE
## 3                                FALSE                                  FALSE
## 4                                FALSE                                  FALSE
## 5                                FALSE                                  FALSE
## 6                                FALSE                                   TRUE
##   Property_TypeEntire townhouse Property_TypePrivate room
## 1                         FALSE                     FALSE
## 2                         FALSE                     FALSE
## 3                         FALSE                     FALSE
## 4                         FALSE                     FALSE
## 5                         FALSE                     FALSE
## 6                         FALSE                     FALSE
##   Property_TypePrivate room in bed and breakfast
## 1                                          FALSE
## 2                                          FALSE
## 3                                          FALSE
## 4                                          FALSE
## 5                                          FALSE
## 6                                          FALSE
##   Property_TypePrivate room in bungalow
## 1                                 FALSE
## 2                                 FALSE
## 3                                 FALSE
## 4                                 FALSE
## 5                                 FALSE
## 6                                 FALSE
##   Property_TypePrivate room in condominium (condo)
## 1                                            FALSE
## 2                                            FALSE
## 3                                            FALSE
## 4                                            FALSE
## 5                                            FALSE
## 6                                            FALSE
##   Property_TypePrivate room in guest suite Property_TypePrivate room in hostel
## 1                                    FALSE                               FALSE
## 2                                    FALSE                               FALSE
## 3                                    FALSE                               FALSE
## 4                                    FALSE                               FALSE
## 5                                    FALSE                               FALSE
## 6                                    FALSE                               FALSE
##   Property_TypePrivate room in loft Property_TypePrivate room in rental unit
## 1                             FALSE                                    FALSE
## 2                             FALSE                                    FALSE
## 3                             FALSE                                    FALSE
## 4                             FALSE                                    FALSE
## 5                             FALSE                                    FALSE
## 6                             FALSE                                    FALSE
##   Property_TypePrivate room in residential home
## 1                                         FALSE
## 2                                         FALSE
## 3                                         FALSE
## 4                                          TRUE
## 5                                          TRUE
## 6                                          TRUE
##   Property_TypePrivate room in serviced apartment
## 1                                           FALSE
## 2                                           FALSE
## 3                                           FALSE
## 4                                           FALSE
## 5                                           FALSE
## 6                                           FALSE
##   Property_TypePrivate room in townhouse Property_TypePrivate room in villa
## 1                                  FALSE                              FALSE
## 2                                  FALSE                              FALSE
## 3                                  FALSE                              FALSE
## 4                                  FALSE                              FALSE
## 5                                  FALSE                              FALSE
## 6                                  FALSE                              FALSE
##   Property_TypeRoom in aparthotel Property_TypeRoom in boutique hotel
## 1                           FALSE                               FALSE
## 2                           FALSE                               FALSE
## 3                           FALSE                               FALSE
## 4                           FALSE                               FALSE
## 5                           FALSE                               FALSE
## 6                           FALSE                               FALSE
##   Property_TypeRoom in hotel Property_TypeShared room
## 1                      FALSE                    FALSE
## 2                      FALSE                    FALSE
## 3                      FALSE                    FALSE
## 4                      FALSE                    FALSE
## 5                      FALSE                    FALSE
## 6                      FALSE                    FALSE
##   Property_TypeShared room in bed and breakfast
## 1                                         FALSE
## 2                                         FALSE
## 3                                         FALSE
## 4                                         FALSE
## 5                                         FALSE
## 6                                         FALSE
##   Property_TypeShared room in boutique hotel Property_TypeShared room in hostel
## 1                                      FALSE                              FALSE
## 2                                      FALSE                              FALSE
## 3                                      FALSE                              FALSE
## 4                                      FALSE                              FALSE
## 5                                      FALSE                              FALSE
## 6                                      FALSE                              FALSE
##   Property_TypeShared room in rental unit
## 1                                   FALSE
## 2                                   FALSE
## 3                                   FALSE
## 4                                   FALSE
## 5                                   FALSE
## 6                                   FALSE
##   Property_TypeShared room in residential home Property_TypeTent
## 1                                        FALSE             FALSE
## 2                                        FALSE             FALSE
## 3                                        FALSE             FALSE
## 4                                        FALSE             FALSE
## 5                                        FALSE             FALSE
## 6                                        FALSE             FALSE
##   Property_TypeTiny house Room_TypePrivate room Room_TypeEntire home/apt Rating
## 1                   FALSE                 FALSE                    FALSE  FALSE
## 2                   FALSE                 FALSE                    FALSE  FALSE
## 3                   FALSE                 FALSE                    FALSE  FALSE
## 4                   FALSE                 FALSE                    FALSE  FALSE
## 5                   FALSE                 FALSE                    FALSE  FALSE
## 6                   FALSE                 FALSE                    FALSE  FALSE
##   neighbourhood_cleansedBedok neighbourhood_cleansedBishan
## 1                       FALSE                        FALSE
## 2                       FALSE                        FALSE
## 3                       FALSE                        FALSE
## 4                       FALSE                        FALSE
## 5                       FALSE                        FALSE
## 6                       FALSE                        FALSE
##   neighbourhood_cleansedBukit Batok neighbourhood_cleansedBukit Merah
## 1                             FALSE                             FALSE
## 2                             FALSE                             FALSE
## 3                             FALSE                             FALSE
## 4                             FALSE                             FALSE
## 5                             FALSE                             FALSE
## 6                             FALSE                             FALSE
##   neighbourhood_cleansedBukit Panjang neighbourhood_cleansedBukit Timah
## 1                               FALSE                             FALSE
## 2                               FALSE                             FALSE
## 3                               FALSE                             FALSE
## 4                               FALSE                             FALSE
## 5                               FALSE                             FALSE
## 6                               FALSE                             FALSE
##   neighbourhood_cleansedCentral Water Catchment
## 1                                         FALSE
## 2                                         FALSE
## 3                                         FALSE
## 4                                         FALSE
## 5                                         FALSE
## 6                                         FALSE
##   neighbourhood_cleansedChoa Chu Kang neighbourhood_cleansedClementi
## 1                               FALSE                          FALSE
## 2                               FALSE                          FALSE
## 3                               FALSE                          FALSE
## 4                               FALSE                          FALSE
## 5                               FALSE                          FALSE
## 6                               FALSE                          FALSE
##   neighbourhood_cleansedDowntown Core neighbourhood_cleansedGeylang
## 1                               FALSE                         FALSE
## 2                               FALSE                         FALSE
## 3                               FALSE                         FALSE
## 4                               FALSE                         FALSE
## 5                               FALSE                         FALSE
## 6                               FALSE                         FALSE
##   neighbourhood_cleansedHougang neighbourhood_cleansedJurong East
## 1                         FALSE                             FALSE
## 2                         FALSE                             FALSE
## 3                         FALSE                             FALSE
## 4                         FALSE                             FALSE
## 5                         FALSE                             FALSE
## 6                         FALSE                             FALSE
##   neighbourhood_cleansedJurong West neighbourhood_cleansedKallang
## 1                             FALSE                         FALSE
## 2                             FALSE                         FALSE
## 3                             FALSE                         FALSE
## 4                             FALSE                         FALSE
## 5                             FALSE                         FALSE
## 6                             FALSE                         FALSE
##   neighbourhood_cleansedLim Chu Kang neighbourhood_cleansedMandai
## 1                              FALSE                        FALSE
## 2                              FALSE                        FALSE
## 3                              FALSE                        FALSE
## 4                              FALSE                        FALSE
## 5                              FALSE                        FALSE
## 6                              FALSE                        FALSE
##   neighbourhood_cleansedMarina South neighbourhood_cleansedMarine Parade
## 1                              FALSE                               FALSE
## 2                              FALSE                               FALSE
## 3                              FALSE                               FALSE
## 4                              FALSE                               FALSE
## 5                              FALSE                               FALSE
## 6                              FALSE                               FALSE
##   neighbourhood_cleansedMuseum neighbourhood_cleansedNewton
## 1                        FALSE                        FALSE
## 2                        FALSE                        FALSE
## 3                        FALSE                        FALSE
## 4                        FALSE                        FALSE
## 5                        FALSE                        FALSE
## 6                        FALSE                        FALSE
##   neighbourhood_cleansedNovena neighbourhood_cleansedOrchard
## 1                        FALSE                         FALSE
## 2                        FALSE                         FALSE
## 3                        FALSE                         FALSE
## 4                        FALSE                         FALSE
## 5                        FALSE                         FALSE
## 6                        FALSE                         FALSE
##   neighbourhood_cleansedOutram neighbourhood_cleansedPasir Ris
## 1                        FALSE                           FALSE
## 2                        FALSE                           FALSE
## 3                        FALSE                           FALSE
## 4                        FALSE                           FALSE
## 5                        FALSE                           FALSE
## 6                        FALSE                           FALSE
##   neighbourhood_cleansedPunggol neighbourhood_cleansedQueenstown
## 1                         FALSE                            FALSE
## 2                         FALSE                            FALSE
## 3                         FALSE                            FALSE
## 4                         FALSE                            FALSE
## 5                         FALSE                            FALSE
## 6                         FALSE                            FALSE
##   neighbourhood_cleansedRiver Valley neighbourhood_cleansedRochor
## 1                              FALSE                        FALSE
## 2                              FALSE                        FALSE
## 3                              FALSE                        FALSE
## 4                              FALSE                        FALSE
## 5                              FALSE                        FALSE
## 6                              FALSE                        FALSE
##   neighbourhood_cleansedSembawang neighbourhood_cleansedSengkang
## 1                           FALSE                          FALSE
## 2                           FALSE                          FALSE
## 3                           FALSE                          FALSE
## 4                           FALSE                          FALSE
## 5                           FALSE                          FALSE
## 6                           FALSE                          FALSE
##   neighbourhood_cleansedSerangoon neighbourhood_cleansedSingapore River
## 1                           FALSE                                 FALSE
## 2                           FALSE                                 FALSE
## 3                           FALSE                                 FALSE
## 4                           FALSE                                 FALSE
## 5                           FALSE                                 FALSE
## 6                           FALSE                                 FALSE
##   neighbourhood_cleansedSouthern Islands neighbourhood_cleansedTampines
## 1                                  FALSE                          FALSE
## 2                                  FALSE                          FALSE
## 3                                  FALSE                          FALSE
## 4                                  FALSE                          FALSE
## 5                                  FALSE                          FALSE
## 6                                  FALSE                          FALSE
##   neighbourhood_cleansedTanglin neighbourhood_cleansedToa Payoh
## 1                         FALSE                           FALSE
## 2                         FALSE                           FALSE
## 3                         FALSE                           FALSE
## 4                         FALSE                           FALSE
## 5                         FALSE                           FALSE
## 6                         FALSE                           FALSE
##   neighbourhood_cleansedWoodlands neighbourhood_cleansedYishun
## 1                           FALSE                        FALSE
## 2                           FALSE                        FALSE
## 3                           FALSE                        FALSE
## 4                           FALSE                        FALSE
## 5                           FALSE                        FALSE
## 6                           FALSE                        FALSE
##   host_response_timeN/A host_response_timewithin a day
## 1                 FALSE                          FALSE
## 2                 FALSE                          FALSE
## 3                 FALSE                          FALSE
## 4                 FALSE                          FALSE
## 5                 FALSE                          FALSE
## 6                 FALSE                          FALSE
##   host_response_timewithin a few hours host_response_timewithin an hour
## 1                                FALSE                            FALSE
## 2                                FALSE                            FALSE
## 3                                FALSE                            FALSE
## 4                                FALSE                            FALSE
## 5                                FALSE                            FALSE
## 6                                FALSE                            FALSE
##   host_acceptance_rate host_Superhost no_of_am Amenities_Wifi Amenities_Shampoo
## 1                FALSE          FALSE    FALSE          FALSE             FALSE
## 2                FALSE          FALSE    FALSE          FALSE             FALSE
## 3                FALSE          FALSE    FALSE          FALSE             FALSE
## 4                FALSE          FALSE    FALSE          FALSE             FALSE
## 5                FALSE          FALSE    FALSE          FALSE             FALSE
## 6                FALSE          FALSE    FALSE          FALSE             FALSE
##   Amenities_Kitchen Amenities_Long_Term Amenities_Washer Amenities_HairDryer
## 1             FALSE               FALSE            FALSE               FALSE
## 2             FALSE               FALSE            FALSE               FALSE
## 3             FALSE               FALSE            FALSE               FALSE
## 4             FALSE               FALSE            FALSE               FALSE
## 5             FALSE               FALSE            FALSE               FALSE
## 6             FALSE               FALSE            FALSE               FALSE
##   Amenities_HotWater Amenities_TV Amenities_AC hv_email hv_phone hv_facebook
## 1              FALSE        FALSE        FALSE    FALSE    FALSE       FALSE
## 2              FALSE        FALSE        FALSE    FALSE    FALSE       FALSE
## 3              FALSE        FALSE        FALSE    FALSE    FALSE       FALSE
## 4              FALSE        FALSE        FALSE    FALSE    FALSE       FALSE
## 5              FALSE        FALSE        FALSE    FALSE    FALSE       FALSE
## 6              FALSE        FALSE        FALSE    FALSE    FALSE       FALSE
##   hv_reviews hv_manual_offline hv_manual_jumio hv_manual_off_gov hv_manual_gov
## 1      FALSE             FALSE           FALSE             FALSE         FALSE
## 2      FALSE             FALSE           FALSE             FALSE         FALSE
## 3      FALSE             FALSE           FALSE             FALSE         FALSE
## 4      FALSE             FALSE           FALSE             FALSE         FALSE
## 5      FALSE             FALSE           FALSE             FALSE         FALSE
## 6      FALSE             FALSE           FALSE             FALSE         FALSE
##   hv_manual_work_email no_of_vf Days_since_last_review Capacity_Sqr Beds_Sqr
## 1                FALSE    FALSE                  FALSE        FALSE    FALSE
## 2                FALSE    FALSE                  FALSE        FALSE    FALSE
## 3                FALSE    FALSE                  FALSE        FALSE    FALSE
## 4                FALSE    FALSE                  FALSE        FALSE    FALSE
## 5                FALSE    FALSE                  FALSE        FALSE    FALSE
## 6                FALSE    FALSE                  FALSE        FALSE    FALSE
##   ln_Beds ln_Capacity ln_Rating Shared_ind House_ind Private_ind
## 1   FALSE       FALSE     FALSE      FALSE     FALSE       FALSE
## 2   FALSE       FALSE     FALSE      FALSE     FALSE       FALSE
## 3   FALSE       FALSE     FALSE      FALSE     FALSE       FALSE
## 4   FALSE       FALSE     FALSE      FALSE     FALSE       FALSE
## 5   FALSE       FALSE     FALSE      FALSE     FALSE       FALSE
## 6   FALSE       FALSE     FALSE      FALSE     FALSE       FALSE
##   Capacity_x_Shared_ind H_Cap P_Cap ln_Capacity_x_Shared_ind
## 1                 FALSE FALSE FALSE                    FALSE
## 2                 FALSE  TRUE FALSE                    FALSE
## 3                 FALSE  TRUE FALSE                    FALSE
## 4                 FALSE  TRUE FALSE                    FALSE
## 5                 FALSE  TRUE FALSE                    FALSE
## 6                 FALSE  TRUE FALSE                    FALSE
##   ln_Capacity_x_House_ind ln_Capacity_x_Private_ind reviews_since_2019
## 1                   FALSE                     FALSE               TRUE
## 2                   FALSE                     FALSE               TRUE
## 3                   FALSE                     FALSE               TRUE
## 4                   FALSE                     FALSE               TRUE
## 5                   FALSE                     FALSE               TRUE
## 6                   FALSE                     FALSE               TRUE
##   bookings_since_2019 nb_group_1 nb_group_2 nb_group_3 nb_group_4 nb_group_5
## 1               FALSE      FALSE      FALSE      FALSE      FALSE      FALSE
## 2               FALSE      FALSE      FALSE      FALSE      FALSE      FALSE
## 3               FALSE      FALSE      FALSE      FALSE      FALSE      FALSE
## 4               FALSE      FALSE      FALSE      FALSE      FALSE      FALSE
## 5               FALSE      FALSE      FALSE      FALSE      FALSE      FALSE
## 6               FALSE      FALSE      FALSE      FALSE      FALSE      FALSE
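Most columns of the printed matrix are FALSE for every model size, which makes the output hard to scan. Below is a minimal sketch, using base R only, of one way to trim it: keep just the columns that are TRUE at least once. The matrix `which_demo` is a hand-copied stand-in holding four of the columns printed above (it is not the full Singapore result). In the notebook, the same filter would presumably be applied to `summary(backwardstep_leaps_sin)$which`.

```r
# Stand-in for summary(backwardstep_leaps_sin)$which: four columns copied
# by hand from the printed output above (illustrative subset only).
which_demo <- cbind(
  H_Cap               = c(FALSE, TRUE, TRUE, TRUE, TRUE, TRUE),
  P_Cap               = rep(FALSE, 6),
  reviews_since_2019  = rep(TRUE, 6),
  bookings_since_2019 = rep(FALSE, 6)
)
rownames(which_demo) <- as.character(1:6)  # one row per model size

# Keep only predictors selected in at least one model size.
# drop = FALSE keeps the result a matrix even if a single column survives.
ever_selected <- which_demo[, colSums(which_demo) > 0, drop = FALSE]
print(ever_selected)
```

Of these four columns, only `H_Cap` and `reviews_since_2019` survive the filter, matching what the full printout shows for them.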
summary(backwardstep_leaps_hkg)$which
##   (Intercept) Reviews  Beds Capacity Monthly_Reviews Property_TypeCastle
## 1        TRUE   FALSE FALSE    FALSE           FALSE               FALSE
## 2        TRUE   FALSE FALSE    FALSE           FALSE               FALSE
## 3        TRUE   FALSE FALSE    FALSE           FALSE               FALSE
## 4        TRUE   FALSE  TRUE    FALSE           FALSE               FALSE
## 5        TRUE    TRUE  TRUE    FALSE           FALSE               FALSE
## 6        TRUE    TRUE  TRUE    FALSE           FALSE               FALSE
##   Property_TypeEarth house Property_TypeEntire bungalow
## 1                    FALSE                        FALSE
## 2                    FALSE                        FALSE
## 3                    FALSE                        FALSE
## 4                    FALSE                        FALSE
## 5                    FALSE                        FALSE
## 6                    FALSE                        FALSE
##   Property_TypeEntire chalet Property_TypeEntire condominium (condo)
## 1                      FALSE                                   FALSE
## 2                      FALSE                                   FALSE
## 3                      FALSE                                   FALSE
## 4                      FALSE                                   FALSE
## 5                      FALSE                                   FALSE
## 6                      FALSE                                   FALSE
##   Property_TypeEntire cottage Property_TypeEntire guest suite
## 1                       FALSE                           FALSE
## 2                       FALSE                           FALSE
## 3                       FALSE                           FALSE
## 4                       FALSE                           FALSE
## 5                       FALSE                           FALSE
## 6                       FALSE                           FALSE
##   Property_TypeEntire guesthouse Property_TypeEntire loft
## 1                          FALSE                    FALSE
## 2                          FALSE                    FALSE
## 3                          FALSE                    FALSE
## 4                          FALSE                    FALSE
## 5                          FALSE                    FALSE
## 6                          FALSE                    FALSE
##   Property_TypeEntire place Property_TypeEntire rental unit
## 1                     FALSE                           FALSE
## 2                     FALSE                           FALSE
## 3                     FALSE                           FALSE
## 4                     FALSE                           FALSE
## 5                     FALSE                           FALSE
## 6                     FALSE                           FALSE
##   Property_TypeEntire residential home Property_TypeEntire serviced apartment
## 1                                FALSE                                  FALSE
## 2                                FALSE                                  FALSE
## 3                                FALSE                                  FALSE
## 4                                FALSE                                  FALSE
## 5                                FALSE                                  FALSE
## 6                                FALSE                                  FALSE
##   Property_TypeEntire townhouse Property_TypeEntire villa
## 1                         FALSE                     FALSE
## 2                         FALSE                     FALSE
## 3                         FALSE                     FALSE
## 4                         FALSE                     FALSE
## 5                         FALSE                     FALSE
## 6                         FALSE                     FALSE
##   Property_TypeFarm stay Property_TypeHouseboat Property_TypeHut
## 1                  FALSE                  FALSE            FALSE
## 2                  FALSE                  FALSE            FALSE
## 3                  FALSE                  FALSE            FALSE
## 4                  FALSE                  FALSE            FALSE
## 5                  FALSE                  FALSE            FALSE
## 6                  FALSE                  FALSE            FALSE
##   Property_TypePension Property_TypePrivate room
## 1                FALSE                     FALSE
## 2                FALSE                     FALSE
## 3                FALSE                     FALSE
## 4                FALSE                     FALSE
## 5                FALSE                     FALSE
## 6                FALSE                     FALSE
##   Property_TypePrivate room in bed and breakfast
## 1                                          FALSE
## 2                                          FALSE
## 3                                          FALSE
## 4                                          FALSE
## 5                                          FALSE
## 6                                          FALSE
##   Property_TypePrivate room in boat Property_TypePrivate room in bungalow
## 1                             FALSE                                 FALSE
## 2                             FALSE                                 FALSE
## 3                             FALSE                                 FALSE
## 4                             FALSE                                 FALSE
## 5                             FALSE                                 FALSE
## 6                             FALSE                                 FALSE
##   Property_TypePrivate room in condominium (condo)
## 1                                            FALSE
## 2                                            FALSE
## 3                                            FALSE
## 4                                            FALSE
## 5                                            FALSE
## 6                                            FALSE
##   Property_TypePrivate room in cottage Property_TypePrivate room in guest suite
## 1                                FALSE                                    FALSE
## 2                                FALSE                                    FALSE
## 3                                FALSE                                    FALSE
## 4                                FALSE                                    FALSE
## 5                                FALSE                                    FALSE
## 6                                FALSE                                    FALSE
##   Property_TypePrivate room in guesthouse Property_TypePrivate room in hostel
## 1                                   FALSE                               FALSE
## 2                                   FALSE                               FALSE
## 3                                   FALSE                               FALSE
## 4                                   FALSE                               FALSE
## 5                                   FALSE                               FALSE
## 6                                   FALSE                               FALSE
##   Property_TypePrivate room in loft Property_TypePrivate room in nature lodge
## 1                             FALSE                                     FALSE
## 2                             FALSE                                     FALSE
## 3                             FALSE                                     FALSE
## 4                             FALSE                                     FALSE
## 5                             FALSE                                     FALSE
## 6                             FALSE                                     FALSE
##   Property_TypePrivate room in rental unit
## 1                                    FALSE
## 2                                    FALSE
## 3                                    FALSE
## 4                                    FALSE
## 5                                    FALSE
## 6                                    FALSE
##   Property_TypePrivate room in residential home
## 1                                         FALSE
## 2                                         FALSE
## 3                                         FALSE
## 4                                         FALSE
## 5                                         FALSE
## 6                                         FALSE
##   Property_TypePrivate room in serviced apartment
## 1                                           FALSE
## 2                                           FALSE
## 3                                           FALSE
## 4                                           FALSE
## 5                                           FALSE
## 6                                           FALSE
##   Property_TypePrivate room in tiny house
## 1                                   FALSE
## 2                                   FALSE
## 3                                   FALSE
## 4                                   FALSE
## 5                                   FALSE
## 6                                   FALSE
##   Property_TypePrivate room in townhouse Property_TypePrivate room in villa
## 1                                  FALSE                              FALSE
## 2                                  FALSE                              FALSE
## 3                                   TRUE                              FALSE
## 4                                   TRUE                              FALSE
## 5                                   TRUE                              FALSE
## 6                                   TRUE                              FALSE
##   Property_TypeRoom in aparthotel Property_TypeRoom in boutique hotel
## 1                           FALSE                               FALSE
## 2                           FALSE                               FALSE
## 3                           FALSE                               FALSE
## 4                           FALSE                               FALSE
## 5                           FALSE                               FALSE
## 6                           FALSE                               FALSE
##   Property_TypeRoom in hotel Property_TypeShared room
## 1                      FALSE                    FALSE
## 2                      FALSE                    FALSE
## 3                      FALSE                    FALSE
## 4                      FALSE                    FALSE
## 5                      FALSE                    FALSE
## 6                      FALSE                    FALSE
##   Property_TypeShared room in bed and breakfast
## 1                                         FALSE
## 2                                         FALSE
## 3                                         FALSE
## 4                                         FALSE
## 5                                         FALSE
## 6                                         FALSE
##   Property_TypeShared room in boat Property_TypeShared room in boutique hotel
## 1                            FALSE                                      FALSE
## 2                            FALSE                                      FALSE
## 3                            FALSE                                      FALSE
## 4                            FALSE                                      FALSE
## 5                            FALSE                                      FALSE
## 6                            FALSE                                      FALSE
##   Property_TypeShared room in condominium (condo)
## 1                                           FALSE
## 2                                           FALSE
## 3                                           FALSE
## 4                                           FALSE
## 5                                           FALSE
## 6                                           FALSE
##   Property_TypeShared room in guest suite Property_TypeShared room in hostel
## 1                                   FALSE                              FALSE
## 2                                   FALSE                              FALSE
## 3                                   FALSE                              FALSE
## 4                                   FALSE                              FALSE
## 5                                   FALSE                              FALSE
## 6                                   FALSE                              FALSE
##   Property_TypeShared room in rental unit
## 1                                   FALSE
## 2                                   FALSE
## 3                                   FALSE
## 4                                   FALSE
## 5                                   FALSE
## 6                                   FALSE
##   Property_TypeShared room in residential home
## 1                                        FALSE
## 2                                        FALSE
## 3                                        FALSE
## 4                                        FALSE
## 5                                        FALSE
## 6                                        FALSE
##   Property_TypeShared room in serviced apartment
## 1                                          FALSE
## 2                                          FALSE
## 3                                          FALSE
## 4                                          FALSE
## 5                                          FALSE
## 6                                          FALSE
##   Property_TypeShared room in townhouse Property_TypeTiny house
## 1                                 FALSE                   FALSE
## 2                                 FALSE                   FALSE
## 3                                 FALSE                   FALSE
## 4                                 FALSE                   FALSE
## 5                                 FALSE                   FALSE
## 6                                 FALSE                   FALSE
##   Room_TypePrivate room Room_TypeEntire home/apt Rating
## 1                 FALSE                    FALSE  FALSE
## 2                 FALSE                    FALSE  FALSE
## 3                 FALSE                    FALSE  FALSE
## 4                 FALSE                    FALSE  FALSE
## 5                 FALSE                    FALSE  FALSE
## 6                 FALSE                    FALSE  FALSE
##   neighbourhood_cleansedEastern neighbourhood_cleansedIslands
## 1                         FALSE                         FALSE
## 2                         FALSE                         FALSE
## 3                         FALSE                         FALSE
## 4                         FALSE                         FALSE
## 5                         FALSE                         FALSE
## 6                         FALSE                         FALSE
##   neighbourhood_cleansedKowloon City neighbourhood_cleansedKwai Tsing
## 1                              FALSE                            FALSE
## 2                              FALSE                            FALSE
## 3                              FALSE                            FALSE
## 4                              FALSE                            FALSE
## 5                              FALSE                            FALSE
## 6                              FALSE                            FALSE
##   neighbourhood_cleansedKwun Tong neighbourhood_cleansedNorth
## 1                           FALSE                       FALSE
## 2                           FALSE                       FALSE
## 3                           FALSE                       FALSE
## 4                           FALSE                       FALSE
## 5                           FALSE                       FALSE
## 6                           FALSE                       FALSE
##   neighbourhood_cleansedSai Kung neighbourhood_cleansedSha Tin
## 1                          FALSE                         FALSE
## 2                          FALSE                         FALSE
## 3                          FALSE                         FALSE
## 4                          FALSE                         FALSE
## 5                          FALSE                         FALSE
## 6                          FALSE                         FALSE
##   neighbourhood_cleansedSham Shui Po neighbourhood_cleansedSouthern
## 1                              FALSE                          FALSE
## 2                              FALSE                          FALSE
## 3                              FALSE                          FALSE
## 4                              FALSE                          FALSE
## 5                              FALSE                          FALSE
## 6                              FALSE                          FALSE
##   neighbourhood_cleansedTai Po neighbourhood_cleansedTsuen Wan
## 1                        FALSE                           FALSE
## 2                        FALSE                            TRUE
## 3                        FALSE                            TRUE
## 4                        FALSE                            TRUE
## 5                        FALSE                            TRUE
## 6                        FALSE                            TRUE
##   neighbourhood_cleansedTuen Mun neighbourhood_cleansedWan Chai
## 1                          FALSE                          FALSE
## 2                          FALSE                          FALSE
## 3                          FALSE                          FALSE
## 4                          FALSE                          FALSE
## 5                          FALSE                          FALSE
## 6                          FALSE                          FALSE
##   neighbourhood_cleansedWong Tai Sin neighbourhood_cleansedYau Tsim Mong
## 1                              FALSE                               FALSE
## 2                              FALSE                               FALSE
## 3                              FALSE                               FALSE
## 4                              FALSE                               FALSE
## 5                              FALSE                               FALSE
## 6                              FALSE                               FALSE
##   neighbourhood_cleansedYuen Long host_response_timeN/A
## 1                           FALSE                 FALSE
## 2                           FALSE                 FALSE
## 3                           FALSE                 FALSE
## 4                           FALSE                 FALSE
## 5                           FALSE                 FALSE
## 6                           FALSE                 FALSE
##   host_response_timewithin a day host_response_timewithin a few hours
## 1                          FALSE                                FALSE
## 2                          FALSE                                FALSE
## 3                          FALSE                                FALSE
## 4                          FALSE                                FALSE
## 5                          FALSE                                FALSE
## 6                          FALSE                                FALSE
##   host_response_timewithin an hour host_acceptance_rate host_Superhost no_of_am
## 1                            FALSE                FALSE          FALSE    FALSE
## 2                            FALSE                FALSE          FALSE    FALSE
## 3                            FALSE                FALSE          FALSE    FALSE
## 4                            FALSE                FALSE          FALSE    FALSE
## 5                            FALSE                FALSE          FALSE    FALSE
## 6                            FALSE                FALSE          FALSE    FALSE
##   Amenities_Wifi Amenities_Shampoo Amenities_Kitchen Amenities_Long_Term
## 1          FALSE             FALSE             FALSE               FALSE
## 2          FALSE             FALSE             FALSE               FALSE
## 3          FALSE             FALSE             FALSE               FALSE
## 4          FALSE             FALSE             FALSE               FALSE
## 5          FALSE             FALSE             FALSE               FALSE
## 6          FALSE             FALSE             FALSE               FALSE
##   Amenities_Washer Amenities_HairDryer Amenities_HotWater Amenities_TV
## 1            FALSE               FALSE              FALSE        FALSE
## 2            FALSE               FALSE              FALSE        FALSE
## 3            FALSE               FALSE              FALSE        FALSE
## 4            FALSE               FALSE              FALSE        FALSE
## 5            FALSE               FALSE              FALSE        FALSE
## 6            FALSE               FALSE              FALSE        FALSE
##   Amenities_AC hv_email hv_phone hv_facebook hv_reviews hv_manual_offline
## 1        FALSE    FALSE    FALSE       FALSE      FALSE             FALSE
## 2        FALSE    FALSE    FALSE       FALSE      FALSE             FALSE
## 3        FALSE    FALSE    FALSE       FALSE      FALSE             FALSE
## 4        FALSE    FALSE    FALSE       FALSE      FALSE             FALSE
## 5        FALSE    FALSE    FALSE       FALSE      FALSE             FALSE
## 6        FALSE    FALSE    FALSE       FALSE      FALSE             FALSE
##   hv_manual_jumio hv_manual_off_gov hv_manual_gov hv_manual_work_email no_of_vf
## 1           FALSE             FALSE         FALSE                FALSE    FALSE
## 2           FALSE             FALSE         FALSE                FALSE    FALSE
## 3           FALSE             FALSE         FALSE                FALSE    FALSE
## 4           FALSE             FALSE         FALSE                FALSE    FALSE
## 5           FALSE             FALSE         FALSE                FALSE    FALSE
## 6           FALSE             FALSE          TRUE                FALSE    FALSE
##   Days_since_last_review Capacity_Sqr Beds_Sqr ln_Beds ln_Capacity ln_Rating
## 1                  FALSE        FALSE    FALSE   FALSE       FALSE     FALSE
## 2                  FALSE        FALSE    FALSE   FALSE       FALSE     FALSE
## 3                  FALSE        FALSE    FALSE   FALSE       FALSE     FALSE
## 4                  FALSE        FALSE    FALSE   FALSE       FALSE     FALSE
## 5                  FALSE        FALSE    FALSE   FALSE       FALSE     FALSE
## 6                  FALSE        FALSE    FALSE   FALSE       FALSE     FALSE
##   Shared_ind House_ind Private_ind Capacity_x_Shared_ind H_Cap P_Cap
## 1      FALSE     FALSE       FALSE                 FALSE FALSE FALSE
## 2      FALSE     FALSE       FALSE                 FALSE FALSE FALSE
## 3      FALSE     FALSE       FALSE                 FALSE FALSE FALSE
## 4      FALSE     FALSE       FALSE                 FALSE FALSE FALSE
## 5      FALSE     FALSE       FALSE                 FALSE FALSE FALSE
## 6      FALSE     FALSE       FALSE                 FALSE FALSE FALSE
##   ln_Capacity_x_Shared_ind ln_Capacity_x_House_ind ln_Capacity_x_Private_ind
## 1                    FALSE                   FALSE                     FALSE
## 2                    FALSE                   FALSE                     FALSE
## 3                    FALSE                   FALSE                     FALSE
## 4                    FALSE                   FALSE                     FALSE
## 5                    FALSE                   FALSE                     FALSE
## 6                    FALSE                   FALSE                     FALSE
##   reviews_since_2019 bookings_since_2019 nb_group_1 nb_group_2 nb_group_3
## 1               TRUE               FALSE      FALSE      FALSE      FALSE
## 2               TRUE               FALSE      FALSE      FALSE      FALSE
## 3               TRUE               FALSE      FALSE      FALSE      FALSE
## 4               TRUE               FALSE      FALSE      FALSE      FALSE
## 5               TRUE               FALSE      FALSE      FALSE      FALSE
## 6               TRUE               FALSE      FALSE      FALSE      FALSE
##   nb_group_4 nb_group_5
## 1      FALSE      FALSE
## 2      FALSE      FALSE
## 3      FALSE      FALSE
## 4      FALSE      FALSE
## 5      FALSE      FALSE
## 6      FALSE      FALSE
summary(backwardstep_leaps_nrt)$which
##   (Intercept) Reviews  Beds Capacity Monthly_Reviews
## 1        TRUE   FALSE FALSE    FALSE           FALSE
## 2        TRUE   FALSE FALSE     TRUE           FALSE
## 3        TRUE   FALSE FALSE     TRUE           FALSE
## 4        TRUE   FALSE FALSE     TRUE           FALSE
## 5        TRUE   FALSE FALSE     TRUE           FALSE
## 6        TRUE   FALSE FALSE     TRUE           FALSE
##   Property_TypeCasa particular Property_TypeEarth house
## 1                        FALSE                    FALSE
## 2                        FALSE                    FALSE
## 3                        FALSE                    FALSE
## 4                        FALSE                    FALSE
## 5                        FALSE                    FALSE
## 6                        FALSE                    FALSE
##   Property_TypeEntire bungalow Property_TypeEntire cabin
## 1                        FALSE                     FALSE
## 2                        FALSE                     FALSE
## 3                        FALSE                     FALSE
## 4                        FALSE                     FALSE
## 5                        FALSE                     FALSE
## 6                        FALSE                     FALSE
##   Property_TypeEntire condominium (condo) Property_TypeEntire guest suite
## 1                                   FALSE                           FALSE
## 2                                   FALSE                           FALSE
## 3                                   FALSE                           FALSE
## 4                                   FALSE                           FALSE
## 5                                   FALSE                           FALSE
## 6                                   FALSE                           FALSE
##   Property_TypeEntire guesthouse Property_TypeEntire hostel
## 1                          FALSE                      FALSE
## 2                          FALSE                      FALSE
## 3                          FALSE                      FALSE
## 4                          FALSE                      FALSE
## 5                          FALSE                      FALSE
## 6                          FALSE                      FALSE
##   Property_TypeEntire loft Property_TypeEntire place
## 1                    FALSE                     FALSE
## 2                    FALSE                     FALSE
## 3                    FALSE                     FALSE
## 4                    FALSE                     FALSE
## 5                    FALSE                     FALSE
## 6                    FALSE                     FALSE
##   Property_TypeEntire rental unit Property_TypeEntire residential home
## 1                           FALSE                                FALSE
## 2                           FALSE                                FALSE
## 3                           FALSE                                FALSE
## 4                           FALSE                                FALSE
## 5                           FALSE                                FALSE
## 6                           FALSE                                FALSE
##   Property_TypeEntire serviced apartment Property_TypeEntire townhouse
## 1                                  FALSE                         FALSE
## 2                                  FALSE                         FALSE
## 3                                  FALSE                         FALSE
## 4                                  FALSE                         FALSE
## 5                                  FALSE                         FALSE
## 6                                  FALSE                         FALSE
##   Property_TypeEntire vacation home Property_TypeEntire villa Property_TypeHut
## 1                             FALSE                     FALSE            FALSE
## 2                             FALSE                     FALSE            FALSE
## 3                             FALSE                     FALSE            FALSE
## 4                             FALSE                     FALSE            FALSE
## 5                             FALSE                      TRUE            FALSE
## 6                             FALSE                      TRUE            FALSE
##   Property_TypePrivate room in bed and breakfast
## 1                                          FALSE
## 2                                          FALSE
## 3                                          FALSE
## 4                                          FALSE
## 5                                          FALSE
## 6                                          FALSE
##   Property_TypePrivate room in condominium (condo)
## 1                                            FALSE
## 2                                            FALSE
## 3                                            FALSE
## 4                                            FALSE
## 5                                            FALSE
## 6                                            FALSE
##   Property_TypePrivate room in guest suite
## 1                                    FALSE
## 2                                    FALSE
## 3                                    FALSE
## 4                                    FALSE
## 5                                    FALSE
## 6                                    FALSE
##   Property_TypePrivate room in guesthouse Property_TypePrivate room in hostel
## 1                                   FALSE                               FALSE
## 2                                   FALSE                               FALSE
## 3                                   FALSE                               FALSE
## 4                                   FALSE                               FALSE
## 5                                   FALSE                               FALSE
## 6                                   FALSE                               FALSE
##   Property_TypePrivate room in hut Property_TypePrivate room in rental unit
## 1                            FALSE                                    FALSE
## 2                            FALSE                                    FALSE
## 3                            FALSE                                    FALSE
## 4                            FALSE                                    FALSE
## 5                            FALSE                                    FALSE
## 6                            FALSE                                    FALSE
##   Property_TypePrivate room in residential home
## 1                                         FALSE
## 2                                         FALSE
## 3                                         FALSE
## 4                                         FALSE
## 5                                         FALSE
## 6                                         FALSE
##   Property_TypePrivate room in resort Property_TypePrivate room in ryokan
## 1                               FALSE                               FALSE
## 2                               FALSE                               FALSE
## 3                               FALSE                               FALSE
## 4                               FALSE                               FALSE
## 5                               FALSE                               FALSE
## 6                               FALSE                               FALSE
##   Property_TypePrivate room in serviced apartment
## 1                                           FALSE
## 2                                           FALSE
## 3                                           FALSE
## 4                                           FALSE
## 5                                           FALSE
## 6                                           FALSE
##   Property_TypePrivate room in tiny house
## 1                                   FALSE
## 2                                   FALSE
## 3                                   FALSE
## 4                                   FALSE
## 5                                   FALSE
## 6                                   FALSE
##   Property_TypePrivate room in townhouse Property_TypePrivate room in villa
## 1                                  FALSE                              FALSE
## 2                                  FALSE                              FALSE
## 3                                  FALSE                              FALSE
## 4                                  FALSE                              FALSE
## 5                                  FALSE                              FALSE
## 6                                  FALSE                              FALSE
##   Property_TypeRoom in aparthotel Property_TypeRoom in boutique hotel
## 1                           FALSE                               FALSE
## 2                           FALSE                               FALSE
## 3                           FALSE                               FALSE
## 4                           FALSE                               FALSE
## 5                           FALSE                               FALSE
## 6                           FALSE                               FALSE
##   Property_TypeRoom in hotel Property_TypeShared room
## 1                      FALSE                    FALSE
## 2                      FALSE                    FALSE
## 3                      FALSE                    FALSE
## 4                      FALSE                    FALSE
## 5                      FALSE                    FALSE
## 6                      FALSE                    FALSE
##   Property_TypeShared room in aparthotel
## 1                                  FALSE
## 2                                  FALSE
## 3                                  FALSE
## 4                                  FALSE
## 5                                  FALSE
## 6                                  FALSE
##   Property_TypeShared room in bed and breakfast
## 1                                         FALSE
## 2                                         FALSE
## 3                                         FALSE
## 4                                         FALSE
## 5                                         FALSE
## 6                                         FALSE
##   Property_TypeShared room in boutique hotel Property_TypeShared room in hostel
## 1                                      FALSE                              FALSE
## 2                                      FALSE                              FALSE
## 3                                      FALSE                              FALSE
## 4                                      FALSE                              FALSE
## 5                                      FALSE                              FALSE
## 6                                      FALSE                              FALSE
##   Property_TypeShared room in hotel Property_TypeShared room in hut
## 1                             FALSE                           FALSE
## 2                             FALSE                           FALSE
## 3                             FALSE                           FALSE
## 4                             FALSE                           FALSE
## 5                             FALSE                           FALSE
## 6                             FALSE                           FALSE
##   Property_TypeShared room in rental unit
## 1                                   FALSE
## 2                                   FALSE
## 3                                   FALSE
## 4                                   FALSE
## 5                                   FALSE
## 6                                   FALSE
##   Property_TypeShared room in residential home
## 1                                        FALSE
## 2                                        FALSE
## 3                                        FALSE
## 4                                        FALSE
## 5                                        FALSE
## 6                                        FALSE
##   Property_TypeShared room in ryokan Property_TypeTiny house
## 1                              FALSE                   FALSE
## 2                              FALSE                   FALSE
## 3                              FALSE                   FALSE
## 4                              FALSE                   FALSE
## 5                              FALSE                   FALSE
## 6                              FALSE                   FALSE
##   Property_TypeTreehouse Room_TypePrivate room Room_TypeEntire home/apt Rating
## 1                  FALSE                 FALSE                    FALSE  FALSE
## 2                  FALSE                 FALSE                    FALSE  FALSE
## 3                   TRUE                 FALSE                    FALSE  FALSE
## 4                   TRUE                 FALSE                    FALSE  FALSE
## 5                   TRUE                 FALSE                    FALSE  FALSE
## 6                   TRUE                 FALSE                    FALSE  FALSE
##   neighbourhood_cleansedAkiruno Shi neighbourhood_cleansedAkishima Shi
## 1                             FALSE                              FALSE
## 2                             FALSE                              FALSE
## 3                             FALSE                              FALSE
## 4                             FALSE                              FALSE
## 5                             FALSE                              FALSE
## 6                             FALSE                              FALSE
##   neighbourhood_cleansedArakawa Ku neighbourhood_cleansedBunkyo Ku
## 1                            FALSE                           FALSE
## 2                            FALSE                           FALSE
## 3                            FALSE                           FALSE
## 4                            FALSE                           FALSE
## 5                            FALSE                           FALSE
## 6                            FALSE                           FALSE
##   neighbourhood_cleansedChiyoda Ku neighbourhood_cleansedChofu Shi
## 1                            FALSE                           FALSE
## 2                            FALSE                           FALSE
## 3                            FALSE                           FALSE
## 4                            FALSE                           FALSE
## 5                            FALSE                           FALSE
## 6                            FALSE                           FALSE
##   neighbourhood_cleansedChuo Ku neighbourhood_cleansedEdogawa Ku
## 1                         FALSE                            FALSE
## 2                         FALSE                            FALSE
## 3                         FALSE                            FALSE
## 4                         FALSE                            FALSE
## 5                         FALSE                            FALSE
## 6                         FALSE                            FALSE
##   neighbourhood_cleansedFuchu Shi neighbourhood_cleansedHachioji Shi
## 1                           FALSE                              FALSE
## 2                           FALSE                              FALSE
## 3                           FALSE                              FALSE
## 4                           FALSE                              FALSE
## 5                           FALSE                              FALSE
## 6                           FALSE                              FALSE
##   neighbourhood_cleansedHamura Shi neighbourhood_cleansedHigashikurume Shi
## 1                            FALSE                                   FALSE
## 2                            FALSE                                   FALSE
## 3                            FALSE                                   FALSE
## 4                            FALSE                                   FALSE
## 5                            FALSE                                   FALSE
## 6                            FALSE                                   FALSE
##   neighbourhood_cleansedHigashimurayama Shi neighbourhood_cleansedHino Shi
## 1                                     FALSE                          FALSE
## 2                                     FALSE                          FALSE
## 3                                     FALSE                          FALSE
## 4                                     FALSE                          FALSE
## 5                                     FALSE                          FALSE
## 6                                     FALSE                          FALSE
##   neighbourhood_cleansedItabashi Ku neighbourhood_cleansedKatsushika Ku
## 1                             FALSE                               FALSE
## 2                             FALSE                               FALSE
## 3                             FALSE                               FALSE
## 4                             FALSE                               FALSE
## 5                             FALSE                               FALSE
## 6                             FALSE                               FALSE
##   neighbourhood_cleansedKita Ku neighbourhood_cleansedKodaira Shi
## 1                         FALSE                             FALSE
## 2                         FALSE                             FALSE
## 3                         FALSE                             FALSE
## 4                         FALSE                             FALSE
## 5                         FALSE                             FALSE
## 6                         FALSE                             FALSE
##   neighbourhood_cleansedKoganei Shi neighbourhood_cleansedKokubunji Shi
## 1                             FALSE                               FALSE
## 2                             FALSE                               FALSE
## 3                             FALSE                               FALSE
## 4                             FALSE                               FALSE
## 5                             FALSE                               FALSE
## 6                             FALSE                               FALSE
##   neighbourhood_cleansedKomae Shi neighbourhood_cleansedKoto Ku
## 1                           FALSE                         FALSE
## 2                           FALSE                         FALSE
## 3                           FALSE                         FALSE
## 4                           FALSE                         FALSE
## 5                           FALSE                         FALSE
## 6                           FALSE                         FALSE
##   neighbourhood_cleansedKunitachi Shi neighbourhood_cleansedMachida Shi
## 1                               FALSE                             FALSE
## 2                               FALSE                             FALSE
## 3                               FALSE                             FALSE
## 4                               FALSE                             FALSE
## 5                               FALSE                             FALSE
## 6                               FALSE                             FALSE
##   neighbourhood_cleansedMeguro Ku neighbourhood_cleansedMinato Ku
## 1                           FALSE                           FALSE
## 2                           FALSE                           FALSE
## 3                           FALSE                           FALSE
## 4                           FALSE                           FALSE
## 5                           FALSE                           FALSE
## 6                           FALSE                           FALSE
##   neighbourhood_cleansedMitaka Shi neighbourhood_cleansedMusashimurayama Shi
## 1                            FALSE                                     FALSE
## 2                            FALSE                                     FALSE
## 3                            FALSE                                     FALSE
## 4                            FALSE                                     FALSE
## 5                            FALSE                                     FALSE
## 6                            FALSE                                     FALSE
##   neighbourhood_cleansedMusashino Shi neighbourhood_cleansedNakano Ku
## 1                               FALSE                           FALSE
## 2                               FALSE                           FALSE
## 3                               FALSE                           FALSE
## 4                               FALSE                           FALSE
## 5                               FALSE                           FALSE
## 6                               FALSE                           FALSE
##   neighbourhood_cleansedNerima Ku neighbourhood_cleansedNishitokyo Shi
## 1                           FALSE                                FALSE
## 2                           FALSE                                FALSE
## 3                           FALSE                                FALSE
## 4                           FALSE                                FALSE
## 5                           FALSE                                FALSE
## 6                           FALSE                                FALSE
##   neighbourhood_cleansedOkutama Machi neighbourhood_cleansedOme Shi
## 1                               FALSE                         FALSE
## 2                               FALSE                         FALSE
## 3                               FALSE                         FALSE
## 4                               FALSE                         FALSE
## 5                               FALSE                         FALSE
## 6                               FALSE                         FALSE
##   neighbourhood_cleansedOta Ku neighbourhood_cleansedSetagaya Ku
## 1                        FALSE                             FALSE
## 2                        FALSE                             FALSE
## 3                        FALSE                             FALSE
## 4                        FALSE                             FALSE
## 5                        FALSE                             FALSE
## 6                        FALSE                             FALSE
##   neighbourhood_cleansedShibuya Ku neighbourhood_cleansedShinagawa Ku
## 1                            FALSE                              FALSE
## 2                            FALSE                              FALSE
## 3                            FALSE                              FALSE
## 4                            FALSE                              FALSE
## 5                            FALSE                              FALSE
## 6                            FALSE                              FALSE
##   neighbourhood_cleansedShinjuku Ku neighbourhood_cleansedSuginami Ku
## 1                             FALSE                             FALSE
## 2                             FALSE                             FALSE
## 3                             FALSE                             FALSE
## 4                             FALSE                             FALSE
## 5                             FALSE                             FALSE
## 6                             FALSE                             FALSE
##   neighbourhood_cleansedSumida Ku neighbourhood_cleansedTachikawa Shi
## 1                           FALSE                               FALSE
## 2                           FALSE                               FALSE
## 3                           FALSE                               FALSE
## 4                           FALSE                               FALSE
## 5                           FALSE                               FALSE
## 6                           FALSE                               FALSE
##   neighbourhood_cleansedTaito Ku neighbourhood_cleansedTama Shi
## 1                          FALSE                          FALSE
## 2                          FALSE                          FALSE
## 3                          FALSE                          FALSE
## 4                          FALSE                          FALSE
## 5                          FALSE                          FALSE
## 6                          FALSE                          FALSE
##   neighbourhood_cleansedToshima Ku host_response_timeN/A
## 1                            FALSE                 FALSE
## 2                            FALSE                 FALSE
## 3                            FALSE                 FALSE
## 4                            FALSE                 FALSE
## 5                            FALSE                 FALSE
## 6                            FALSE                 FALSE
##   host_response_timewithin a day host_response_timewithin a few hours
## 1                          FALSE                                FALSE
## 2                          FALSE                                FALSE
## 3                          FALSE                                FALSE
## 4                          FALSE                                FALSE
## 5                          FALSE                                FALSE
## 6                          FALSE                                FALSE
##   host_response_timewithin an hour host_acceptance_rate host_Superhost no_of_am
## 1                            FALSE                FALSE          FALSE    FALSE
## 2                            FALSE                FALSE          FALSE    FALSE
## 3                            FALSE                FALSE          FALSE    FALSE
## 4                            FALSE                FALSE          FALSE    FALSE
## 5                            FALSE                FALSE          FALSE    FALSE
## 6                            FALSE                FALSE          FALSE    FALSE
##   Amenities_Wifi Amenities_Shampoo Amenities_Kitchen Amenities_Long_Term
## 1          FALSE             FALSE             FALSE               FALSE
## 2          FALSE             FALSE             FALSE               FALSE
## 3          FALSE             FALSE             FALSE               FALSE
## 4          FALSE             FALSE             FALSE               FALSE
## 5          FALSE             FALSE             FALSE               FALSE
## 6          FALSE             FALSE             FALSE               FALSE
##   Amenities_Washer Amenities_HairDryer Amenities_HotWater Amenities_TV
## 1            FALSE               FALSE              FALSE        FALSE
## 2            FALSE               FALSE              FALSE        FALSE
## 3            FALSE               FALSE              FALSE        FALSE
## 4            FALSE               FALSE              FALSE        FALSE
## 5            FALSE               FALSE              FALSE        FALSE
## 6            FALSE               FALSE              FALSE        FALSE
##   Amenities_AC hv_email hv_phone hv_facebook hv_reviews hv_manual_offline
## 1        FALSE    FALSE    FALSE       FALSE      FALSE             FALSE
## 2        FALSE    FALSE    FALSE       FALSE      FALSE             FALSE
## 3        FALSE    FALSE    FALSE       FALSE      FALSE             FALSE
## 4        FALSE    FALSE    FALSE       FALSE       TRUE             FALSE
## 5        FALSE    FALSE    FALSE       FALSE       TRUE             FALSE
## 6        FALSE    FALSE    FALSE       FALSE       TRUE             FALSE
##   hv_manual_jumio hv_manual_off_gov hv_manual_gov hv_manual_work_email no_of_vf
## 1           FALSE             FALSE         FALSE                FALSE    FALSE
## 2           FALSE             FALSE         FALSE                FALSE    FALSE
## 3           FALSE             FALSE         FALSE                FALSE    FALSE
## 4           FALSE             FALSE         FALSE                FALSE    FALSE
## 5           FALSE             FALSE         FALSE                FALSE    FALSE
## 6           FALSE             FALSE         FALSE                FALSE    FALSE
##   Days_since_last_review Capacity_Sqr Beds_Sqr ln_Beds ln_Capacity ln_Rating
## 1                  FALSE        FALSE    FALSE   FALSE       FALSE     FALSE
## 2                  FALSE        FALSE    FALSE   FALSE       FALSE     FALSE
## 3                  FALSE        FALSE    FALSE   FALSE       FALSE     FALSE
## 4                  FALSE        FALSE    FALSE   FALSE       FALSE     FALSE
## 5                  FALSE        FALSE    FALSE   FALSE       FALSE     FALSE
## 6                   TRUE        FALSE    FALSE   FALSE       FALSE     FALSE
##   Shared_ind House_ind Private_ind Capacity_x_Shared_ind H_Cap P_Cap
## 1      FALSE     FALSE       FALSE                 FALSE FALSE FALSE
## 2      FALSE     FALSE       FALSE                 FALSE FALSE FALSE
## 3      FALSE     FALSE       FALSE                 FALSE FALSE FALSE
## 4      FALSE     FALSE       FALSE                 FALSE FALSE FALSE
## 5      FALSE     FALSE       FALSE                 FALSE FALSE FALSE
## 6      FALSE     FALSE       FALSE                 FALSE FALSE FALSE
##   ln_Capacity_x_Shared_ind ln_Capacity_x_House_ind ln_Capacity_x_Private_ind
## 1                    FALSE                   FALSE                     FALSE
## 2                    FALSE                   FALSE                     FALSE
## 3                    FALSE                   FALSE                     FALSE
## 4                    FALSE                   FALSE                     FALSE
## 5                    FALSE                   FALSE                     FALSE
## 6                    FALSE                   FALSE                     FALSE
##   reviews_since_2019 bookings_since_2019 nb_group_1 nb_group_2 nb_group_3
## 1               TRUE               FALSE      FALSE      FALSE      FALSE
## 2               TRUE               FALSE      FALSE      FALSE      FALSE
## 3               TRUE               FALSE      FALSE      FALSE      FALSE
## 4               TRUE               FALSE      FALSE      FALSE      FALSE
## 5               TRUE               FALSE      FALSE      FALSE      FALSE
## 6               TRUE               FALSE      FALSE      FALSE      FALSE
##   nb_group_4 nb_group_5
## 1      FALSE      FALSE
## 2      FALSE      FALSE
## 3      FALSE      FALSE
## 4      FALSE      FALSE
## 5      FALSE      FALSE
## 6      FALSE      FALSE
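The logical matrix above is wide, so the handful of TRUE cells are hard to spot. As a reading aid, here is a minimal sketch of a helper that collapses such a matrix into the list of terms kept at each model size. The function name `selected_terms` and the toy matrix `toy_which` are hypothetical and not part of the original analysis. The toy matrix copies only four columns of the first two rows printed above. It is assumed that `summary(backwardstep_leaps_nrt)$which` is an ordinary logical matrix with term names as column names, as the printout suggests.

```r
# Hypothetical helper (not in the original analysis): for each row of a
# logical selection matrix, return the names of the columns that are TRUE.
selected_terms <- function(which_matrix) {
  terms <- lapply(seq_len(nrow(which_matrix)), function(i) {
    colnames(which_matrix)[which_matrix[i, ]]
  })
  names(terms) <- rownames(which_matrix)
  terms
}

# Toy stand-in for summary(backwardstep_leaps_nrt)$which, restricted to four
# columns and the first two model sizes shown in the output above.
toy_which <- matrix(
  c(TRUE, FALSE, FALSE, TRUE,
    TRUE, TRUE,  FALSE, TRUE),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    c("1", "2"),
    c("(Intercept)", "Capacity", "Beds", "reviews_since_2019")
  )
)

selected_terms(toy_which)
```

If the assumption holds, calling `selected_terms(summary(backwardstep_leaps_nrt)$which)` would give one short character vector per model size instead of the full grid.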
summary(backwardstep_leaps_tpe)$which
##   (Intercept) Reviews  Beds Capacity Monthly_Reviews
## 1        TRUE   FALSE FALSE    FALSE           FALSE
## 2        TRUE   FALSE FALSE    FALSE           FALSE
## 3        TRUE   FALSE FALSE    FALSE           FALSE
## 4        TRUE   FALSE FALSE    FALSE           FALSE
## 5        TRUE   FALSE FALSE    FALSE           FALSE
## 6        TRUE   FALSE FALSE    FALSE           FALSE
##   Property_TypeEntire condominium (condo) Property_TypeEntire guest suite
## 1                                   FALSE                           FALSE
## 2                                   FALSE                           FALSE
## 3                                   FALSE                           FALSE
## 4                                   FALSE                           FALSE
## 5                                   FALSE                           FALSE
## 6                                   FALSE                           FALSE
##   Property_TypeEntire guesthouse Property_TypeEntire loft
## 1                          FALSE                    FALSE
## 2                          FALSE                    FALSE
## 3                          FALSE                    FALSE
## 4                          FALSE                    FALSE
## 5                          FALSE                    FALSE
## 6                          FALSE                    FALSE
##   Property_TypeEntire place Property_TypeEntire rental unit
## 1                     FALSE                           FALSE
## 2                     FALSE                           FALSE
## 3                     FALSE                           FALSE
## 4                     FALSE                           FALSE
## 5                     FALSE                           FALSE
## 6                     FALSE                           FALSE
##   Property_TypeEntire residential home Property_TypeEntire serviced apartment
## 1                                FALSE                                  FALSE
## 2                                FALSE                                  FALSE
## 3                                FALSE                                  FALSE
## 4                                FALSE                                  FALSE
## 5                                FALSE                                  FALSE
## 6                                FALSE                                  FALSE
##   Property_TypeEntire townhouse Property_TypeEntire villa Property_TypeMinsu
## 1                         FALSE                     FALSE              FALSE
## 2                         FALSE                     FALSE              FALSE
## 3                         FALSE                     FALSE              FALSE
## 4                         FALSE                     FALSE              FALSE
## 5                         FALSE                     FALSE              FALSE
## 6                         FALSE                     FALSE              FALSE
##   Property_TypePrivate room Property_TypePrivate room in bed and breakfast
## 1                     FALSE                                          FALSE
## 2                     FALSE                                          FALSE
## 3                     FALSE                                          FALSE
## 4                     FALSE                                          FALSE
## 5                     FALSE                                          FALSE
## 6                     FALSE                                          FALSE
##   Property_TypePrivate room in bungalow
## 1                                 FALSE
## 2                                 FALSE
## 3                                 FALSE
## 4                                 FALSE
## 5                                 FALSE
## 6                                 FALSE
##   Property_TypePrivate room in casa particular
## 1                                        FALSE
## 2                                        FALSE
## 3                                        FALSE
## 4                                        FALSE
## 5                                        FALSE
## 6                                        FALSE
##   Property_TypePrivate room in condominium (condo)
## 1                                            FALSE
## 2                                            FALSE
## 3                                            FALSE
## 4                                            FALSE
## 5                                            FALSE
## 6                                            FALSE
##   Property_TypePrivate room in guest suite
## 1                                    FALSE
## 2                                    FALSE
## 3                                    FALSE
## 4                                    FALSE
## 5                                    FALSE
## 6                                    FALSE
##   Property_TypePrivate room in guesthouse Property_TypePrivate room in hostel
## 1                                   FALSE                               FALSE
## 2                                   FALSE                               FALSE
## 3                                   FALSE                               FALSE
## 4                                   FALSE                               FALSE
## 5                                   FALSE                               FALSE
## 6                                   FALSE                               FALSE
##   Property_TypePrivate room in loft Property_TypePrivate room in minsu
## 1                             FALSE                              FALSE
## 2                             FALSE                              FALSE
## 3                             FALSE                              FALSE
## 4                             FALSE                              FALSE
## 5                             FALSE                              FALSE
## 6                             FALSE                              FALSE
##   Property_TypePrivate room in rental unit
## 1                                    FALSE
## 2                                    FALSE
## 3                                    FALSE
## 4                                    FALSE
## 5                                    FALSE
## 6                                    FALSE
##   Property_TypePrivate room in residential home
## 1                                         FALSE
## 2                                         FALSE
## 3                                         FALSE
## 4                                         FALSE
## 5                                         FALSE
## 6                                         FALSE
##   Property_TypePrivate room in serviced apartment
## 1                                           FALSE
## 2                                           FALSE
## 3                                           FALSE
## 4                                           FALSE
## 5                                           FALSE
## 6                                           FALSE
##   Property_TypePrivate room in townhouse Property_TypeRoom in aparthotel
## 1                                  FALSE                           FALSE
## 2                                  FALSE                           FALSE
## 3                                  FALSE                           FALSE
## 4                                  FALSE                           FALSE
## 5                                  FALSE                           FALSE
## 6                                  FALSE                           FALSE
##   Property_TypeRoom in boutique hotel Property_TypeRoom in hotel
## 1                               FALSE                      FALSE
## 2                               FALSE                      FALSE
## 3                               FALSE                      FALSE
## 4                               FALSE                      FALSE
## 5                               FALSE                      FALSE
## 6                               FALSE                      FALSE
##   Property_TypeShared room in bed and breakfast
## 1                                         FALSE
## 2                                         FALSE
## 3                                         FALSE
## 4                                         FALSE
## 5                                         FALSE
## 6                                         FALSE
##   Property_TypeShared room in boutique hotel Property_TypeShared room in cave
## 1                                      FALSE                            FALSE
## 2                                      FALSE                            FALSE
## 3                                      FALSE                            FALSE
## 4                                      FALSE                            FALSE
## 5                                      FALSE                            FALSE
## 6                                      FALSE                            FALSE
##   Property_TypeShared room in condominium (condo)
## 1                                           FALSE
## 2                                           FALSE
## 3                                           FALSE
## 4                                           FALSE
## 5                                           FALSE
## 6                                           FALSE
##   Property_TypeShared room in hostel Property_TypeShared room in loft
## 1                              FALSE                            FALSE
## 2                              FALSE                            FALSE
## 3                              FALSE                            FALSE
## 4                              FALSE                            FALSE
## 5                              FALSE                            FALSE
## 6                              FALSE                            FALSE
##   Property_TypeShared room in rental unit
## 1                                   FALSE
## 2                                   FALSE
## 3                                   FALSE
## 4                                   FALSE
## 5                                   FALSE
## 6                                   FALSE
##   Property_TypeShared room in residential home
## 1                                        FALSE
## 2                                        FALSE
## 3                                        FALSE
## 4                                        FALSE
## 5                                        FALSE
## 6                                        FALSE
##   Property_TypeShared room in serviced apartment
## 1                                          FALSE
## 2                                          FALSE
## 3                                          FALSE
## 4                                          FALSE
## 5                                          FALSE
## 6                                          FALSE
##   Property_TypeShared room in tent Property_TypeTiny house
## 1                            FALSE                   FALSE
## 2                            FALSE                   FALSE
## 3                            FALSE                   FALSE
## 4                            FALSE                   FALSE
## 5                            FALSE                   FALSE
## 6                            FALSE                   FALSE
##   Room_TypePrivate room Room_TypeEntire home/apt Rating
## 1                 FALSE                    FALSE  FALSE
## 2                 FALSE                    FALSE  FALSE
## 3                 FALSE                    FALSE  FALSE
## 4                 FALSE                    FALSE  FALSE
## 5                 FALSE                    FALSE  FALSE
## 6                 FALSE                    FALSE  FALSE
##   neighbourhood_cleansed中正區 neighbourhood_cleansed信義區
## 1                        FALSE                        FALSE
## 2                        FALSE                        FALSE
## 3                        FALSE                        FALSE
## 4                        FALSE                        FALSE
## 5                        FALSE                        FALSE
## 6                        FALSE                        FALSE
##   neighbourhood_cleansed內湖區 neighbourhood_cleansed北投區
## 1                        FALSE                        FALSE
## 2                        FALSE                        FALSE
## 3                        FALSE                        FALSE
## 4                        FALSE                        FALSE
## 5                        FALSE                         TRUE
## 6                        FALSE                         TRUE
##   neighbourhood_cleansed南港區 neighbourhood_cleansed士林區
## 1                        FALSE                        FALSE
## 2                        FALSE                        FALSE
## 3                        FALSE                        FALSE
## 4                        FALSE                        FALSE
## 5                        FALSE                        FALSE
## 6                        FALSE                        FALSE
##   neighbourhood_cleansed大同區 neighbourhood_cleansed大安區
## 1                        FALSE                        FALSE
## 2                        FALSE                        FALSE
## 3                        FALSE                        FALSE
## 4                        FALSE                        FALSE
## 5                        FALSE                        FALSE
## 6                        FALSE                        FALSE
##   neighbourhood_cleansed文山區 neighbourhood_cleansed松山區
## 1                        FALSE                        FALSE
## 2                        FALSE                        FALSE
## 3                        FALSE                        FALSE
## 4                        FALSE                        FALSE
## 5                        FALSE                        FALSE
## 6                        FALSE                        FALSE
##   neighbourhood_cleansed萬華區 host_response_timeN/A
## 1                        FALSE                 FALSE
## 2                        FALSE                 FALSE
## 3                        FALSE                 FALSE
## 4                        FALSE                 FALSE
## 5                        FALSE                 FALSE
## 6                         TRUE                 FALSE
##   host_response_timewithin a day host_response_timewithin a few hours
## 1                          FALSE                                FALSE
## 2                          FALSE                                FALSE
## 3                          FALSE                                FALSE
## 4                          FALSE                                FALSE
## 5                          FALSE                                FALSE
## 6                          FALSE                                FALSE
##   host_response_timewithin an hour host_acceptance_rate host_Superhost no_of_am
## 1                            FALSE                FALSE          FALSE    FALSE
## 2                            FALSE                FALSE          FALSE    FALSE
## 3                            FALSE                FALSE          FALSE    FALSE
## 4                            FALSE                FALSE          FALSE    FALSE
## 5                            FALSE                FALSE          FALSE    FALSE
## 6                            FALSE                FALSE          FALSE    FALSE
##   Amenities_Wifi Amenities_Shampoo Amenities_Kitchen Amenities_Long_Term
## 1          FALSE             FALSE             FALSE               FALSE
## 2          FALSE             FALSE             FALSE               FALSE
## 3          FALSE             FALSE             FALSE               FALSE
## 4          FALSE             FALSE             FALSE               FALSE
## 5          FALSE             FALSE             FALSE               FALSE
## 6          FALSE             FALSE             FALSE               FALSE
##   Amenities_Washer Amenities_HairDryer Amenities_HotWater Amenities_TV
## 1            FALSE               FALSE              FALSE        FALSE
## 2            FALSE               FALSE              FALSE        FALSE
## 3            FALSE               FALSE              FALSE        FALSE
## 4            FALSE               FALSE              FALSE        FALSE
## 5            FALSE               FALSE              FALSE        FALSE
## 6            FALSE               FALSE              FALSE        FALSE
##   Amenities_AC hv_email hv_phone hv_facebook hv_reviews hv_manual_offline
## 1        FALSE    FALSE    FALSE       FALSE      FALSE             FALSE
## 2        FALSE    FALSE    FALSE       FALSE      FALSE             FALSE
## 3        FALSE    FALSE     TRUE       FALSE      FALSE             FALSE
## 4        FALSE    FALSE     TRUE       FALSE      FALSE             FALSE
## 5        FALSE    FALSE     TRUE       FALSE      FALSE             FALSE
## 6        FALSE    FALSE     TRUE       FALSE      FALSE             FALSE
##   hv_manual_jumio hv_manual_off_gov hv_manual_gov hv_manual_work_email no_of_vf
## 1           FALSE             FALSE         FALSE                FALSE    FALSE
## 2           FALSE             FALSE         FALSE                FALSE    FALSE
## 3           FALSE             FALSE         FALSE                FALSE    FALSE
## 4           FALSE             FALSE         FALSE                FALSE    FALSE
## 5           FALSE             FALSE         FALSE                FALSE    FALSE
## 6           FALSE             FALSE         FALSE                FALSE    FALSE
##   Days_since_last_review Capacity_Sqr Beds_Sqr ln_Beds ln_Capacity ln_Rating
## 1                  FALSE        FALSE    FALSE   FALSE       FALSE     FALSE
## 2                  FALSE        FALSE    FALSE   FALSE       FALSE     FALSE
## 3                  FALSE        FALSE    FALSE   FALSE       FALSE     FALSE
## 4                  FALSE        FALSE    FALSE   FALSE       FALSE     FALSE
## 5                  FALSE        FALSE    FALSE   FALSE       FALSE     FALSE
## 6                  FALSE        FALSE    FALSE   FALSE       FALSE     FALSE
##   Shared_ind House_ind Private_ind Capacity_x_Shared_ind H_Cap P_Cap
## 1      FALSE     FALSE       FALSE                 FALSE FALSE FALSE
## 2      FALSE     FALSE       FALSE                 FALSE  TRUE FALSE
## 3      FALSE     FALSE       FALSE                 FALSE  TRUE FALSE
## 4      FALSE     FALSE       FALSE                 FALSE  TRUE FALSE
## 5      FALSE     FALSE       FALSE                 FALSE  TRUE FALSE
## 6      FALSE     FALSE       FALSE                 FALSE  TRUE FALSE
##   ln_Capacity_x_Shared_ind ln_Capacity_x_House_ind ln_Capacity_x_Private_ind
## 1                    FALSE                   FALSE                     FALSE
## 2                    FALSE                   FALSE                     FALSE
## 3                    FALSE                   FALSE                     FALSE
## 4                    FALSE                    TRUE                     FALSE
## 5                    FALSE                    TRUE                     FALSE
## 6                    FALSE                    TRUE                     FALSE
##   reviews_since_2019 bookings_since_2019 nb_group_1 nb_group_2 nb_group_3
## 1               TRUE               FALSE      FALSE      FALSE      FALSE
## 2               TRUE               FALSE      FALSE      FALSE      FALSE
## 3               TRUE               FALSE      FALSE      FALSE      FALSE
## 4               TRUE               FALSE      FALSE      FALSE      FALSE
## 5               TRUE               FALSE      FALSE      FALSE      FALSE
## 6               TRUE               FALSE      FALSE      FALSE      FALSE
##   nb_group_4 nb_group_5
## 1      FALSE      FALSE
## 2      FALSE      FALSE
## 3      FALSE      FALSE
## 4      FALSE      FALSE
## 5      FALSE      FALSE
## 6      FALSE      FALSE

5. Lasso Regression

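We now fit a Lasso regression for each city, with earnings_since_2019 as the response and all remaining columns as predictors. The penalty \(\lambda\) is chosen by 10-fold cross-validation, and we keep the coefficients at lambda.min.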

set.seed(100)
# library(glmnet)

lasso_cv_sin= cv.glmnet(as.matrix(list_after_2019.sin_clean[,!names(list_after_2019.sin_clean) %in% c("earnings_since_2019")]),list_after_2019.sin_clean[,c("earnings_since_2019")],family="gaussian",alpha=1, nfolds=10)

lasso_cv_hkg= cv.glmnet(as.matrix(list_after_2019.hkg_clean[,!names(list_after_2019.hkg_clean) %in% c("earnings_since_2019")]),list_after_2019.hkg_clean[,c("earnings_since_2019")],family="gaussian",alpha=1, nfolds=10)

lasso_cv_nrt= cv.glmnet(as.matrix(list_after_2019.nrt_clean[,!names(list_after_2019.nrt_clean) %in% c("earnings_since_2019")]),list_after_2019.nrt_clean[,c("earnings_since_2019")],family="gaussian",alpha=1, nfolds=10)

lasso_cv_tpe= cv.glmnet(as.matrix(list_after_2019.tpe_clean[,!names(list_after_2019.tpe_clean) %in% c("earnings_since_2019")]),list_after_2019.tpe_clean[,c("earnings_since_2019")],family="gaussian",alpha=1, nfolds=10)

lasso_coef.sin <- coef(lasso_cv_sin,s=lasso_cv_sin$lambda.min)
lasso_coef.hkg <- coef(lasso_cv_hkg,s=lasso_cv_hkg$lambda.min)
lasso_coef.nrt <- coef(lasso_cv_nrt,s=lasso_cv_nrt$lambda.min)
lasso_coef.tpe <- coef(lasso_cv_tpe,s=lasso_cv_tpe$lambda.min)
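
For reference, this is the objective that glmnet documents for family="gaussian" once alpha=1 removes the ridge part of the elastic-net penalty. The notation is ours rather than the notebook's: \(N\) is the number of listings in a city, \(y_i\) is earnings_since_2019 for listing \(i\), \(x_i\) is its row of predictors, and \(\beta_0\) is the intercept.

\[
\min_{\beta_0,\,\beta}\ \frac{1}{2N}\sum_{i=1}^{N}\left(y_i-\beta_0-x_i^{\top}\beta\right)^2+\lambda\,\lVert\beta\rVert_1
\]

Larger values of \(\lambda\) push more entries of \(\beta\) to exactly zero, which is why the coefficient tables that follow contain only a subset of the features. lambda.min is the value of \(\lambda\) with the lowest mean cross-validated error.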

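We now look at the top and bottom n non-zero coefficients for each city. The coefficients are standardised (centred and scaled with scale()) within each city, so the weights shown are relative to that city's other coefficients rather than raw effects.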

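returnDf <- function (model, input_coef) 
{
  # input_coef is the sparse one-column matrix returned by coef() on a cv.glmnet fit.
  # model is not used here; it is kept so that the existing calls keep working.
  # Coefficients are rounded to 2 decimals before the zero filter, so terms that round to 0 are dropped too.
  lasso_coef_df <- data.frame(features = rownames(input_coef), coefs = round(as.vector(input_coef), 2)) %>% filter(coefs != 0)
  return (lasso_coef_df)
}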
lasso_coef_df.sin <- returnDf(lasso_cv_sin, lasso_coef.sin) %>% rename (sin_coefs = coefs)
lasso_coef_df.hkg <- returnDf(lasso_cv_hkg, lasso_coef.hkg) %>% rename (hkg_coefs = coefs)
lasso_coef_df.nrt <- returnDf(lasso_cv_nrt, lasso_coef.nrt) %>% rename (nrt_coefs = coefs)
lasso_coef_df.tpe <- returnDf(lasso_cv_tpe, lasso_coef.tpe) %>% rename (tpe_coefs = coefs)

# full_join() takes the key through `by`, not `on`.
lasso_coef_df <- full_join(lasso_coef_df.nrt, lasso_coef_df.tpe, by="features")
lasso_coef_df <- full_join(lasso_coef_df, lasso_coef_df.sin, by="features")
lasso_coef_df <- full_join(lasso_coef_df, lasso_coef_df.hkg, by="features")
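
To illustrate what the combined table looks like, here is a minimal sketch of the same full outer join on the features column. It uses base R's merge() instead of dplyr so that it runs without extra packages. The three small tables and their values are made up for illustration and are not the notebook's results.

```r
# Made-up per-city coefficient tables (illustrative values only).
nrt <- data.frame(features = c("(Intercept)", "H_Cap"), nrt_coefs = c(1.2, 0.3))
tpe <- data.frame(features = c("(Intercept)", "hv_phone"), tpe_coefs = c(0.8, -0.5))
sin <- data.frame(features = c("H_Cap", "hv_phone"), sin_coefs = c(0.1, -0.2))

# Full outer join on the shared key: all = TRUE keeps features found in any table.
combined <- Reduce(function(a, b) merge(a, b, by = "features", all = TRUE),
                   list(nrt, tpe, sin))
print(combined)

# A feature that a city's lasso dropped shows up as NA in that city's column.
dropped_by_nrt <- as.character(combined$features[is.na(combined$nrt_coefs)])
```

An NA in a city's column therefore means "not selected at lambda.min for that city", not a missing measurement.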
# Standardise the coefficients within each city; as.vector() drops the matrix attributes that scale() returns.
# Each city's table is built from that city's own fit and coefficients.
lasso_coef_normalised.sin <- returnDf(lasso_cv_sin, lasso_coef.sin) %>% mutate(coefs = as.vector(scale(coefs)))
lasso_coef_normalised.hkg <- returnDf(lasso_cv_hkg, lasso_coef.hkg) %>% mutate(coefs = as.vector(scale(coefs)))
lasso_coef_normalised.nrt <- returnDf(lasso_cv_nrt, lasso_coef.nrt) %>% mutate(coefs = as.vector(scale(coefs)))
lasso_coef_normalised.tpe <- returnDf(lasso_cv_tpe, lasso_coef.tpe) %>% mutate(coefs = as.vector(scale(coefs)))
returnTopBtmCoefs <- function (lasso_coef_normalised, topn)
{
  return (lasso_coef_normalised %>% arrange(desc(coefs)) %>% top_n(topn) %>% mutate(coefs = round(coefs,2)))
}

paintGraphCoefs <- function(lasso_coef_top5, topn, city)
{
  # A negative topn means the bottom |topn| coefficients were selected, so label the chart accordingly.
  label <- ifelse(topn < 0, "Bottom", "Top")
  top5_listings.fig <- plot_ly(
    x = lasso_coef_top5$features,
    y = lasso_coef_top5$coefs,
    type = "bar",
    text = lasso_coef_top5$coefs
  )
  top5_listings.fig <- top5_listings.fig %>% layout(title = paste(label, abs(topn), "Features for", city), yaxis = list(title="Feature Weight (relative)"))
  top5_listings.fig
}
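
To make the standardise-then-rank step concrete, here is a minimal base-R sketch on made-up coefficients. The feature names and values are hypothetical, and head()/tail() stand in for the top_n() call used in the notebook.

```r
# Made-up non-zero coefficients for one city (illustrative values only).
coef_df <- data.frame(
  features = c("a", "b", "c", "d", "e", "f"),
  coefs    = c(10, 2, 0.5, -0.5, -2, -4)
)

# scale() centres on the mean and divides by the sample standard deviation;
# as.vector() drops the matrix attributes that scale() returns.
coef_df$z <- as.vector(scale(coef_df$coefs))

# Sort from largest to smallest standardised weight.
ranked <- coef_df[order(-coef_df$z), ]
top2 <- head(ranked, 2)   # two largest weights
btm2 <- tail(ranked, 2)   # two smallest weights

# Centring can flip signs: "c" has a positive raw coefficient (0.5) but sits
# below the mean of 1, so its standardised weight is negative.
z_c <- coef_df$z[coef_df$features == "c"]
```

Because the intercept is part of each city's table, it takes part in the centring and scaling as well, so a standardised weight near zero means "close to that city's average coefficient" rather than "no effect".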

lasso_coef_top5.sin <- returnTopBtmCoefs(lasso_coef_normalised.sin, 5)
## Selecting by coefs
paintGraphCoefs(lasso_coef_top5.sin, 5, "Singapore")
lasso_coef_top5.nrt <- returnTopBtmCoefs(lasso_coef_normalised.nrt, 5)
## Selecting by coefs
paintGraphCoefs(lasso_coef_top5.nrt, 5, "Tokyo")
lasso_coef_top5.tpe <- returnTopBtmCoefs(lasso_coef_normalised.tpe, 5)
## Selecting by coefs
paintGraphCoefs(lasso_coef_top5.tpe, 5, "Taipei")
lasso_coef_top5.hkg <- returnTopBtmCoefs(lasso_coef_normalised.hkg, 5)
## Selecting by coefs
paintGraphCoefs(lasso_coef_top5.hkg, 5, "Hong Kong")
lasso_coef_btm5.sin <- returnTopBtmCoefs(lasso_coef_normalised.sin, -5)
## Selecting by coefs
paintGraphCoefs(lasso_coef_btm5.sin, -5, "Singapore")
lasso_coef_btm5.nrt <- returnTopBtmCoefs(lasso_coef_normalised.nrt, -5)
## Selecting by coefs
paintGraphCoefs(lasso_coef_btm5.nrt, -5, "Tokyo")
lasso_coef_btm5.tpe <- returnTopBtmCoefs(lasso_coef_normalised.tpe, -5)
## Selecting by coefs
paintGraphCoefs(lasso_coef_btm5.tpe, -5, "Taipei")
lasso_coef_btm5.hkg <- returnTopBtmCoefs(lasso_coef_normalised.hkg, -5)
## Selecting by coefs
paintGraphCoefs(lasso_coef_btm5.hkg, -5, "Hong Kong")

And in tabular form:

lasso_coef_top5.sin
##                   features coefs
## 1       Property_Type_Boat  5.42
## 2 Property_Type_Tiny house  0.49
## 3              (Intercept) -0.04
## 4                    H_Cap -0.08
## 5            hv_manual_gov -0.10
lasso_coef_top5.nrt
##                                features coefs
## 1                           (Intercept)  4.15
## 2                            nb_group_1  0.43
## 3 Property_Type_Entire residential home  0.28
## 4                     hv_manual_off_gov  0.27
## 5                                 H_Cap  0.16
lasso_coef_top5.tpe
##                               features coefs
## 1              Property_Type_Treehouse  7.16
## 2 Property_Type_Private room in resort  0.61
## 3           Property_Type_Entire villa  0.42
## 4            Property_Type_Entire loft  0.05
## 5                 host_acceptance_rate  0.04
lasso_coef_top5.hkg
##                                  features coefs
## 1 Property_Type_Private room in townhouse  1.42
## 2                              nb_group_1 -0.01
## 3                             (Intercept) -0.63
## 4                      reviews_since_2019 -0.77
lasso_coef_btm5.sin
##                                         features coefs
## 1 Property_Type_Private room in residential home -0.31
## 2                                   Amenities_AC -0.31
## 3                                     nb_group_2 -0.32
## 4                         Property_Type_Campsite -0.32
## 5                                 Amenities_Wifi -0.40
lasso_coef_btm5.nrt
##                                   features coefs
## 1 Property_Type_Entire condominium (condo) -0.14
## 2            Property_Type_Entire bungalow -0.22
## 3                                House_ind -0.23
## 4               Property_Type_Entire villa -1.22
## 5                                 hv_phone -4.32
lasso_coef_btm5.tpe
##                                   features coefs
## 1                               hv_reviews -0.22
## 2 Property_Type_Private room in guesthouse -0.23
## 3                Property_Type_Earth house -0.41
## 4       Property_Type_Entire vacation home -0.49
## 5                              (Intercept) -0.84
lasso_coef_btm5.hkg
##                                  features coefs
## 1 Property_Type_Private room in townhouse  1.42
## 2                              nb_group_1 -0.01
## 3                             (Intercept) -0.63
## 4                      reviews_since_2019 -0.77

Finally, a quick check on the cross-validated mean-squared error across the \(\lambda\) values tried for each city.

plot(lasso_cv_sin)

plot(lasso_cv_hkg)

plot(lasso_cv_nrt)

plot(lasso_cv_tpe)
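
These plots mark two candidate penalties with vertical dotted lines: lambda.min, which the notebook uses, and lambda.1se. The sketch below shows how the two can be compared on a cv.glmnet fit. It runs on synthetic data (not the Airbnb listings), and the sizes and true coefficients are arbitrary assumptions chosen for illustration.

```r
library(glmnet)

set.seed(100)
# Synthetic data: 200 rows, 20 predictors, only the first three matter.
n <- 200; p <- 20
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- as.vector(x[, 1:3] %*% c(3, -2, 1) + rnorm(n))

cv_fit <- cv.glmnet(x, y, family = "gaussian", alpha = 1, nfolds = 10)

# lambda.min minimises the CV error; lambda.1se is the largest lambda whose
# CV error is within one standard error of that minimum.
coef_min <- coef(cv_fit, s = "lambda.min")
coef_1se <- coef(cv_fit, s = "lambda.1se")

# Count non-zero coefficients, excluding the intercept row.
nonzero_min <- sum(as.vector(coef_min)[-1] != 0)
nonzero_1se <- sum(as.vector(coef_1se)[-1] != 0)
c(lambda.min = cv_fit$lambda.min, lambda.1se = cv_fit$lambda.1se)
```

lambda.1se applies a heavier penalty and usually keeps fewer features, which can be a useful robustness check on the features reported at lambda.min.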

This concludes our notebook.